Systems, models and methods for identifying and evaluating skin-active agents effective for treating dandruff/seborrheic dermatitis

ABSTRACT

Methods and systems for determining functional relationships between a skin-active agent and a skin condition of interest, and methods and systems for identifying cosmetic agents effective for treatment of dandruff, as well as the use of agents identified by such methods and systems for the preparation of cosmetic compositions, personal care products, or both are provided. Methods for developing in vitro models of skin disease and models for specific skin diseases are also provided.

PRIORITY

This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Applications 61/470,131 filed on Mar. 31, 2011, 61/488,501 filed on May 20, 2011, and 61/519,504, filed on May 24, 2011, the entire disclosures of which are incorporated herein by this reference.

BACKGROUND OF THE INVENTION

Dandruff, alternatively referred to in the literature as pityriasis simplex, furfuracea or capitis, is a skin disorder characterized by flaking, itching and microinflammation. By definition, dandruff is confined to the scalp, and it is experienced by about half of the adult population irrespective of ethnicity and gender. Dandruff has conventionally been considered trivial from a medical standpoint, but represents a persistent cosmetic concern.

Dandruff is considered to be a form of seborrheic dermatitis, which may be found in other locations on the body. Scalp scaling, which is present in both disorders, is not the clinical distinguishing feature. Rather, inflammation and presence of lesions outside the scalp exclude the diagnosis of dandruff. Generally, dandruff is considered the mildest form of seborrheic dermatitis from a clinical perspective since inflammation is minimal and typically subclinical. In fact, as recently as a decade ago the dandruff condition was thought to be non-inflammatory. The term “dandruff” is also used to describe the symptomatic scale itself, which has conventionally been considered as subject to cosmetic management. Response to currently available cosmetic treatments, however, is often transient. In contrast, seborrheic dermatitis is a more inflammatory disease and medical treatment is often undertaken to clear the disease, although cosmetics may also demonstrate some efficacy in reducing symptoms.

Dandruff scale comprises a cluster of corneocytes which have retained a large degree of cohesion with one another and which become detached as such from the surface of the stratum corneum. The size and degree of scaling are inconsistent across the region of the scalp and as a function of time. One hypothesis of dandruff formation relates to a suppression of the lipases responsible for proper shedding of corneocytes at the skin surface.

The pathogenesis of dandruff is complex, and appears to be the result of interactions between scalp skin, cutaneous microflora and the cutaneous immune system. The key clinical features of dandruff include flaking and itch, and although much descriptive work has been done, the precise underlying events that provoke these symptoms are incompletely understood. Dandruff is considered to have multiple, sometimes overlapping, causes with numerous pathogenic pathways and complex mechanisms. A microbial flora is implicated in the most common forms of dandruff. Generally, a healthy normal scalp is known to harbor many microbes reaching a density of 10³ to 10⁵ organisms per mm² and including, for example, Staphylococci, Propionibacterium spp. and Malassezia spp.

The theory that dandruff is fundamentally a Malassezia-based disorder remains prevalent. However, as the clinical evidence implicating Malassezia in dandruff conditions and seborrheic dermatitis has accumulated over time, it has been observed that results obtained from quantitative methods used to count yeasts are inconsistent and do not correlate. For example, no relationship has been found between severity of symptoms and fungal count. It is hypothesized that only yeasts that are tightly bound to the skin correlate to the dandruff condition. Further perplexing, however, is that quantitative microbiological assessments fail to implicate a role specifically for yeast. Some investigators attribute this to a failure to note the relationship between dandruff and particular species of Malassezia. Regardless of the inability to account for the inconsistent empirical observations, controlling yeast of the genus Malassezia has heretofore been the most successful dandruff management strategy. Application of antifungal-based antidandruff shampoos generally results in lessening or disappearance of itch after only a few applications. A problem with this strategy remains in that although anti-fungal management results in a decrease in the yeasts, complete eradication is rarely achieved. Scale production decreases in parallel with microbe reduction, yet within 2 to 3 weeks of ceasing treatment, dandruff recurs and Malassezia populations increase to their pre-treatment levels.

Close inspection reveals that the Malassezia yeasts appear in scattered clumps restricted in distribution over some corneocytes but not others. It has been hypothesized that dandruff represents a failure of a normal immune response by the specific keratinocytes where Malassezia yeasts are found. Other investigators note that Malassezia have antigenic and pro-inflammatory properties stimulating both the innate and acquired immune responses. Anti-inflammatory drugs such as dermocorticoids have proven efficacy, particularly in severe dandruff. Nonetheless, providing an anti-fungal active remains the conventional treatment of choice.

A further confounding problem in determining the causative basis of dandruff is the mode of action of the anti-fungal agents themselves. Most of the known actives such as zinc pyrithione (ZPT), a biocide widely recognized as an effective anti-fungal agent in shampoo formulations, are substantially insoluble in water so that sustained contact time with the scalp is very brief. Many investigators therefore posit that the anti-fungal agents exhibit efficacy through some ancillary mode of action including some direct biological effects on epidermal cells.

Scaling conditions similar to dandruff may occur with desquamation of the scalp following excessive exposure to sunlight where intercorneocyte cohesion is also affected, as well as in minor chronic irritation of the scalp. Further, over-brushing, over-shampooing, certain cosmetic hair products, and irritation from airborne substances may cause scaling. Other non-fungal causes include use of sebum-derived products, sunlight activation of follicular-photosensitizing agents such as porphyrins, and some neuro-immune conditions. Psychological stress is also widely considered to exacerbate dandruff.

Sebum has been found to be a prerequisite for dandruff, but not a sufficient factor per se. Many people who complain about oily scalp have no dandruff, while successful treatment of dandruff often leads to an increased coating of the hair shafts by sebum. Epidermal lipids exhibit differences across the dandruff condition both in quantity and quality. In particular, it has been demonstrated that the three main classes of stratum corneum lipids, i.e. ceramides, free fatty acids and cholesterol, display diminished content in dandruff-affected relative to healthy scalp skin.

Dandruff severity ranges from mild and discrete to severe and pervasive among affected individuals. Amount of scalp hair is a factor, although amounts of dandruff on the scalp and on hair are not always correlated. Products used to treat dandruff have been observed to suppress androgenic alopecia, and ketoconazole has been reported to stimulate hair growth in mice, among many other effects.

Dandruff, therefore, represents a reactive response of the epidermis of the scalp to various stimuli, some of which may be external and some of which may be internal, in combination with an individual predisposition, and its etiological complexity makes it a treatment challenge. There is a persistent need in the art for methods of identifying potential anti-dandruff agents and for evaluating the efficacy of putative agents having efficacy substantially independent of mechanism of action or etiology of the dandruff condition. The present investigators therefore undertook an investigation into the application of “connectivity mapping” to the search for new skin-active agents with efficacy in the treatment of dandruff and related skin conditions.

Connectivity mapping is a well-known hypothesis generating and testing tool having successful application in the fields of operations research, telecommunications, and more recently in pharmaceutical drug discovery. The undertaking and completion of the Human Genome Project, and the parallel development of very high throughput high-density DNA microarray technologies enabling rapid and simultaneous quantification of cellular mRNA expression levels, resulted in the generation of an enormous amount of gene expression data. At the same time, the search for new pharmaceutical actives via in silico methods such as molecular modeling and docking studies stimulated the generation of vast libraries of potential small molecule actives. The amount of information linking disease to gene expression profiles, gene expression profiles to drugs, and disease to drugs grew exponentially, and application of connectivity mapping as a hypothesis testing tool in the medicinal sciences ripened.

The general notion that functionality could be accurately determined for previously uncharacterized genes, and that potential targets of drug agents could be identified by mapping connections in a data base of gene expression profiles for drug-treated cells, was spearheaded in 2000 with publication of a seminal paper by T. R. Hughes et al. [“Functional discovery via a compendium of expression profiles” Cell 102, 109-126 (2000)], followed shortly thereafter with the launch of The Connectivity Map (C-map) Project by Justin Lamb and researchers at MIT (“Connectivity Map: Gene Expression Signatures to Connect Small Molecules, Genes, and Disease”, Science, Vol 313, 2006). In 2006, Lamb's group began publishing a detailed synopsis of the mechanics of C-map construction and installments of the reference collection of gene expression profiles used to create the first generation C-map and the initiation of an on-going large scale community C-map project, which is available under the “supporting materials” hyperlink at http://www.sciencemag.org/content/313/5795/1929/suppl/DC1.

The basic paradigm of predicting novel relationships between disease, disease phenotype, and drugs employed to modify the disease phenotype, by comparison to known relationships has been practiced for centuries as an intuitive science by medical clinicians. Modern connectivity mapping, with its rigorous mathematical underpinnings and aided by modern computational power, has resulted in confirmed medical successes with identification of new agents for the treatment of various diseases including cancer. Nonetheless, certain limiting presumptions challenge application of C-map with respect to diseases of polygenic origin or syndromic conditions characterized by diverse and often apparently unrelated cellular phenotypic manifestations. According to Lamb, the challenge to constructing a useful C-map is in the selection of input reference data which permit generation of clinically salient and useful output upon query. For the drug-related C-map of Lamb, strong associations comprise the reference associations, and strong associations are the desired output identified as hits.

Noting the benefit of high-throughput, high density profiling platforms which permit automated amplification, labeling, hybridization and scanning of 96 samples in parallel a day, Lamb nonetheless cautioned: “[e]ven this much firepower is insufficient to enable the analysis of every one of the estimated 200 different cell types exposed to every known perturbagen at every possible concentration for every possible duration . . . compromises are therefore required,” Lamb, J. (2007) “The Connectivity Map: a new tool for biomedical research” Nat. Rev. Cancer 7, 54-60, (page 54, column 3, last paragraph). Lamb, however, took the position that cell type did not ultimately matter, and confined his C-map to data from a very small number of established cell lines out of efficiency and standardization concerns. Theoretically this leads to heightened potential for in vitro to in vivo mismatch, and limits output information to the context of a particular cell line. If one accepts the Lamb precept that cell line does not matter then this limitation may be benign.

However, agents suitable as pharmaceutical agents and agents suitable as cosmetic agents are categorically distinct, with the former defining agents selected for specificity and which are intended to have measurable effects on structure and function of the body, while the latter are selected for effect on appearance and may not affect structure and function of the body to a measurable degree. Cosmetic agents tend to be non-specific with respect to effect on cellular phenotype, and administration to the body is generally limited to application on or close to the body surface.

In constructing C-maps relating to pharmaceutical agents, Lamb stresses that particular difficulty is encountered if reference connections are extremely sensitive and at the same time difficult to detect (weak), and Lamb adopted compromises aimed at minimizing numerous, diffuse associations. Since the regulatory scheme for drug products requires high degrees of specificity between a purported drug agent and disease state, and modulation of disease by impacting a single protein with a minimum of tangential associations is desired in development of pharmaceutical actives, the Lamb C-map is well-suited for screening for potential pharmaceutical agents despite the noted compromises.

The connectivity mapping protocols of Lamb would not be predicted, therefore, to have utility for hypothesis testing/generating in the field of cosmetics. Cosmetic formulators seek agents or compositions of agents capable of modulating multiple targets and having effects across complex phenotypes and conditions. Further, the phenotypic impact of a cosmetic agent must be relatively low by definition, so that the agent avoids being subject to the regulatory scheme for pharmaceutical actives. Nonetheless, the impact must be perceptible to the consumer and preferably empirically confirmable by scientific methods. Gene transcription/expression profiles for cosmetic conditions are generally diffuse, comprising many genes with low to moderate fold differentials. Cosmetic agents, therefore, provide more diverse and less acute effects on cellular phenotype and generate the sort of associations expressly taught by Lamb as unsuitable for generating connectivity maps useful for confident hypothesis testing.

Nonetheless, contrary to the teachings of Lamb and the prior art in general, the present inventors surprisingly discovered that useful connectivity maps could be developed from associations among cosmetic actives, cellular phenotypes, and gene expression data, in particular with respect to skin-care actives and cosmetic agents, despite the highly diffuse, systemic and low-level effects these sorts of actives generally engender. Further, contrary to assertions by the Lamb team that results should be substantially independent of cell-type, the present invention is based in part on the surprising discovery that selection of human epidermal keratinocyte cells as the relevant cell line resulted in construction of connectivity maps particularly useful for hypothesis generating and testing relating to skin-active agents and cosmetic agents useful in the treatment of dandruff.

As noted above, the dandruff condition is particularly complex and its etiology is not fully understood. The present investigators therefore made a novel adaptation to the C-map paradigm that has proven to be particularly useful in identifying agents with potential efficacy in certain skin diseases, including dandruff. Although gene expression signatures are determined for the skin condition, the gene expression signature is further analyzed to determine an implicated biological process pattern which is used to derive a physiological thematic expression signature. The theme signature is then used to query the C-map data base to generate a skin-active agent output where highly negative connectivity to the skin condition thematic expression signature predicts efficacy for treatment of the skin condition. To the best knowledge of the present investigators, application of connectivity mapping to target a multi-factored, poorly delineated and low-level “disease” condition such as dandruff, by identifying agents through the use of physiological theme expression signatures has not been attempted previously.
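The text does not state which statistic is used to decide that a biological process is implicated by a gene expression signature. One common choice, assumed here purely for illustration, is a one-sided hypergeometric over-representation test. The following is a minimal sketch under that assumption; the function name, gene identifiers and process annotation are all hypothetical.

```python
from math import comb


def enrichment_p(universe, process_genes, signature):
    """One-sided hypergeometric p-value (assumed statistic, not from the source).

    Returns the probability of seeing at least the observed overlap between
    `signature` and `process_genes` if the signature were drawn at random
    from `universe`.
    """
    process = process_genes & universe
    sig = signature & universe
    n_universe, n_process, n_sig = len(universe), len(process), len(sig)
    overlap = len(process & sig)
    total = comb(n_universe, n_sig)
    tail = sum(
        comb(n_process, k) * comb(n_universe - n_process, n_sig - k)
        for k in range(overlap, min(n_process, n_sig) + 1)
    )
    return tail / total


# Hypothetical toy data: 10 measured genes, 4 annotated to one process,
# and a 3-gene signature that falls entirely inside that process.
universe = {f"G{i}" for i in range(10)}
lipid_process = {"G0", "G1", "G2", "G3"}
signature = {"G0", "G1", "G2"}
p_value = enrichment_p(universe, lipid_process, signature)
```

Processes with small p-values would be candidates for a theme; in practice a correction for testing many processes would also be needed, which this sketch omits.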

The present investigators further discovered that a well-designed connectivity map may provide insights into the pathogenesis of the skin condition and the mechanism of action of benchmark actives. By application of C-map, the present inventors surprisingly discovered, for example, modes of action for anti-fungal agents that are independent of anti-fungal properties. Further, by conducting the transcriptional profiling analyses as part of the C-map process, the present inventors surprisingly discovered that by inspecting a gene expression signature for biological process themes, in vitro models of disease states could be constructed with a surprisingly high fidelity to the clinical disease state with respect to response of the gene expression profile to specific skin-active agents.

Successful identification of anti-dandruff agents has proven to be difficult due to the multi-cellular, multi-factorial processes involved in etiology of the dandruff condition itself. Conventional in vitro studies of biological responses to potential anti-dandruff agents can be hindered by the complex or weakly detectable responses typically induced and/or caused by the putative or potential agents. Such weak responses arise, in part, due to the great number of genes and gene products involved, and skin-active and cosmetic agents may affect multiple genes in multiple ways. Moreover, the degree of bioactivity of cosmetic agents may differ for each gene and be difficult to quantify.

The value of a connectivity map approach to discover functional connections among cosmetic phenotypes such as aged skin, gene expression perturbation, and cosmetic agent action is counter-indicated by the progenitors of the drug-based C-map. The relevant phenotypes are very complex, the genetic perturbations are numerous and weak, and cosmetic agent action is likewise diffuse and by definition, relatively weak. It was considered unlikely that statistically valid data could be generated from cosmetic C-maps and it was unclear whether a cell line existed which could provide salient or detectable cosmetic data.

SUMMARY OF THE INVENTION

Surprisingly, the present inventors have developed a C-map approach to the discovery of skin-active agents having efficacy for particular skin disorders such as dandruff, and which may also be useful for revealing insights into the pathogenesis of the disease and mechanism of action of selected agents.

Accordingly, the present invention provides novel methods, systems and models useful for generating potential new skin-active agents efficacious for the treatment of skin conditions such as dandruff. Through careful selection of cell type, and by generation of a reference collection of gene-expression profiles for known skin-active agents and recognized skin disorders, along with determination of physiological theme expression signatures, the present inventors were surprisingly able to create a connectivity map architecture useful for testing and generating hypotheses about skin-active agents and skin disorders. The present investigators further applied the novel connectivity map protocol to develop an in vitro model of a skin disorder which may be used to test putative or potential skin-active agents and/or to investigate the functional mechanism of a known active.

The present invention provides embodiments which broadly include methods and systems for determining relationships between a skin condition/disorder of interest and one or more skin-active agents, one or more genes associated with the skin disorder condition, and physiological themes implicated by the skin condition and/or affected by a skin-active agent. The inventive methods may be used to identify skin-active agents without detailed knowledge of the mechanisms of biological processes associated with a skin disorder or condition of interest, all of the genes associated with such a condition, or the cell types associated with such a condition.

According to one embodiment of the invention, a method for constructing a data architecture for use in identifying connections between perturbagens and genes associated with one or more skin conditions is provided. The method comprises: (a) providing a gene expression profile for a control human epidermal keratinocyte cell; (b) generating a gene expression profile for a human epidermal keratinocyte cell exposed to at least one perturbagen; (c) identifying genes differentially expressed in response to the at least one perturbagen by comparing the gene expression profiles of (a) and (b); (d) creating an ordered list comprising identifiers representing the differentially expressed genes, wherein the identifiers are ordered according to the differential expression of the genes; (e) storing the ordered list as a keratinocyte instance on at least one computer readable medium; and (f) constructing a data architecture of stored keratinocyte instances by repeating (a) through (e), wherein the at least one perturbagen of step (b) is different for each keratinocyte instance.

According to another embodiment, a method for generating a gene expression signature for use in identifying connections between perturbagens and genes associated with a dandruff condition is provided. The method comprises: (a) providing a gene expression profile for a reference sample of human scalp skin cells; (b) generating a gene expression profile for at least one sample of human scalp skin cells from a subject exhibiting a dandruff condition; (c) comparing the expression profiles of (a) and (b) to determine a gene expression signature comprising a set of genes differentially expressed in (a) and (b); (d) assigning an identifier to each gene constituting the gene expression signature and ordering the identifiers according to the direction of differential expression to create one or more gene expression signature lists; and (e) storing the one or more gene expression signature lists on at least one computer readable medium.
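Steps (a) through (f) of the data-architecture method can be sketched as follows. This is only an illustration: it assumes that "differential expression" is measured as a log2 fold change of treated over control, which the text does not specify, and the function name, gene identifiers, perturbagen name and expression values are hypothetical.

```python
from math import log2


def build_instance(control, treated):
    """Return gene identifiers ordered from most up- to most down-regulated.

    `control` and `treated` map a gene identifier to a positive expression
    value. Ordering by log2 fold change is an assumption of this sketch.
    """
    shared = control.keys() & treated.keys()
    fold = {gene: log2(treated[gene] / control[gene]) for gene in shared}
    # Sort by descending fold change; break ties by identifier for stability.
    return sorted(fold, key=lambda gene: (-fold[gene], gene))


# Hypothetical profiles for a control cell and a perturbagen-treated cell.
control = {"GENE_A": 100.0, "GENE_B": 200.0, "GENE_C": 50.0}
treated = {"GENE_A": 400.0, "GENE_B": 100.0, "GENE_C": 50.0}

instance = build_instance(control, treated)

# The data architecture is modeled here as a plain mapping from a
# (hypothetical) perturbagen name to its stored ordered list.
architecture = {"perturbagen_1": instance}
```

Repeating the call with a different perturbagen and adding the result under a new key corresponds to step (f); persisting `architecture` to disk would correspond to storage on a computer readable medium.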

The inventive data architecture may be provided on a computer readable medium. The computer readable medium comprises a first digital file stored in a spreadsheet file format, a word processing file format, or a database file format suitable to be read by a respective spreadsheet, word processing, or database computer program, the first digital file comprising data arranged to provide one or more gene expression signature lists comprising a plurality of identifiers when read by the respective spreadsheet, word processing, or database computer program; and wherein each identifier is selected from the group consisting of a microarray probe set ID, a human gene name, a human gene symbol, and combinations thereof representing a gene set identified as regulated in the gene expression signature, and wherein the gene expression signature list comprises between about 50 and about 600 identifiers.

A further embodiment is directed to a method for identifying a skin-active agent having predictable efficacy in treatment of a skin condition. The method comprises: a. determining a gene expression signature for the skin condition wherein the gene expression signature comprises genes significantly up- and down-regulated in a skin sample affected with the skin condition when compared to skin not affected with the skin condition; b. determining a thematic expression signature for the skin condition by mapping the gene expression signature on a biological processes grid, such as the Gene Ontology, to determine one or more regulated processes, wherein a theme expression signature reflects statistical clustering of the regulated processes; c. providing a connectivity map data architecture according to the invention; d. querying the connectivity map with the thematic expression signature determined in (b) to generate an output of skin-active agents; and e. rank-ordering the output by connectivity score wherein a negative connectivity score predicts efficacy of a skin-active agent for the treatment of the skin condition.
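Steps (d) and (e) can be illustrated with a deliberately simplified score. The passage above does not define how the connectivity score is computed, so the sketch below substitutes a mean-rank difference as a stand-in: it is not the Kolmogorov-Smirnov-style statistic used in Lamb's Connectivity Map, nor necessarily the score used by the inventors. All names and data are hypothetical.

```python
def connectivity_score(instance, up_genes, down_genes):
    """Simplified stand-in score in [-1, 1] (an assumption of this sketch).

    `instance` is an ordered list, most up-regulated gene first. The score is
    positive when the agent regulates genes in the same direction as the
    signature, and negative when it reverses the signature.
    """
    n = len(instance)
    position = {gene: i / (n - 1) for i, gene in enumerate(instance)}
    ups = [position[g] for g in up_genes if g in position]
    downs = [position[g] for g in down_genes if g in position]
    if not ups or not downs:
        return 0.0
    return sum(downs) / len(downs) - sum(ups) / len(ups)


# Hypothetical signature: G4 is up-regulated and G1 down-regulated in disease.
up_genes, down_genes = {"G4"}, {"G1"}

# Hypothetical stored instances for two agents.
architecture = {
    "agent_x": ["G1", "G2", "G3", "G4"],  # raises G1, lowers G4: reverses signature
    "agent_y": ["G4", "G3", "G2", "G1"],  # mimics the signature
}

scores = {
    agent: connectivity_score(instance, up_genes, down_genes)
    for agent, instance in architecture.items()
}
# Most negative score first: these are the predicted candidates.
ranked = sorted(scores, key=scores.get)
```

With this toy data the agent that reverses the signature scores lowest and is listed first, which mirrors the rule that negative connectivity predicts efficacy.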

In vitro models of a skin disease and methods for constructing them are also disclosed. The models are useful for evaluating clinical efficacy of proposed therapeutic agents in treatment of the skin disease. The method comprises: a. determining a gene expression signature for the disease state wherein the gene expression signature comprises genes significantly up and down regulated in the disease; b. conducting a biological process analysis of the regulated genes to identify biological processes implicated by the regulated genes; c. treating a skin culture to simulate the biological processes identified in (b); d. confirming validity of the in vitro model of the skin disease by determining the gene expression signature for the treated skin culture and assessing the degree to which it mimics the gene signature determined in (a).
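The passage does not say how "the degree to which it mimics" in step (d) is quantified. One simple possibility, assumed here for illustration only, is the fraction of shared signature genes that are regulated in the same direction in the clinical signature and in the treated culture. The function name and fold-change values below are hypothetical.

```python
def direction_concordance(disease_signature, model_signature):
    """Fraction of shared genes regulated in the same direction.

    Each argument maps a gene identifier to a signed fold change
    (positive = up-regulated). This metric is an assumption of the sketch.
    """
    shared = disease_signature.keys() & model_signature.keys()
    if not shared:
        return 0.0
    agree = sum(
        1
        for gene in shared
        if (disease_signature[gene] > 0) == (model_signature[gene] > 0)
    )
    return agree / len(shared)


# Hypothetical signed fold changes for the disease and for the treated culture.
disease = {"A": 1.5, "B": -2.0, "C": 0.8, "D": -0.5}
model = {"A": 0.9, "B": -1.1, "C": -0.3, "D": -0.2}

fidelity = direction_concordance(disease, model)
```

A value near 1 would suggest the culture reproduces the direction of regulation seen clinically; what threshold counts as "high fidelity" is left open here.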

In other aspects, the invention provides inventive gene expression signatures which may exist tangibly in various forms known in the art. For example, a gene expression signature may exist as a set of immobilized oligonucleotides wherein each oligonucleotide uniquely hybridizes to a nucleotide sequence identifying a region of a gene in the signature. It is understood that the “genes set forth” in a table refers to gene identifiers designating the genes, and that a gene expression signature as set forth herein is set forth according to a gene identifier.

These and additional objects, embodiments, and aspects of the invention will become apparent by reference to the Figures and Detailed Description below.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1A and FIG. 1B set forth the genes constituting two different gene expression signatures for dandruff-affected skin. Table B includes the 70 most significantly up- and down-regulated genes as the dandruff gene expression signature, and Table A sets forth the most significantly up- and down-regulated genes in the gene clusters (Lipid Metabolism and Immune Function) defining a thematic profile for dandruff.

FIG. 2 is a schematic illustration of a computer system suitable for use with the present invention;

FIG. 3 is a schematic illustration of an instance associated with a computer readable medium of the computer system of FIG. 2;

FIG. 4 is a schematic illustration of a programmable computer suitable for use with the present invention;

FIG. 5 is a schematic illustration of an exemplary system for generating an instance;

FIG. 6 is a schematic illustration of a comparison between a gene expression signature and an instance, wherein there is a positive correlation between the lists;

FIG. 7 is a schematic illustration of a comparison between a gene expression signature and an instance, wherein there is a negative correlation between the lists; and

FIG. 8 is a schematic illustration of a comparison between a gene expression signature and an instance, wherein there is a neutral correlation between the lists.

FIG. 9 includes a table showing a broad-pattern physiological theme gene expression pattern for dandruff vs. non-dandruff.

FIG. 10 depicts a heat map showing differential gene expression and theme analysis for dandruff versus non-dandruff affected skin.

FIG. 11 depicts a heat map of the average normalized expression values of the significantly regulated genes in dandruff-involved, dandruff-uninvolved and normal scalp skin versus biological processes in Gene Ontology.

FIG. 12 depicts a heat map of differential gene expression in dandruff and non-dandruff conditions, specifically highlighting the inverse thematic relationship between lipid metabolism and immune/inflammation.

FIG. 13 depicts a heat map of differential gene expression between dandruff-affected, dandruff-uninvolved and non-dandruff conditions as group averages versus the lipid metabolism and immune/inflammatory thematic clusters.

FIG. 14 illustrates the differential expression of genes involved in barrier lipid production, specifically the fatty acid synthetic pathway.

FIG. 15 illustrates the differential expression of genes involved in barrier lipid production, specifically the cholesterol synthetic pathway.

FIG. 16 illustrates the differential expression of genes involved in barrier lipid production, specifically the sphingolipid synthetic pathway.

FIG. 17A and FIG. 17B set forth data on barrier lipid and inflammatory biomarkers to illustrate phenotypic support for the transcriptomic findings.

FIG. 18 includes a table and summarizes the transcriptomics study design for analysis of the mechanism of ZPT according to an inventive method.

FIG. 19 depicts a heat map of ZPT treatment results demonstrating that treatment with ZPT results in a profile shift toward healthy scalp/homeostatic equilibrium.

FIG. 20 depicts a heat map showing the effect of ZPT on the physiological thematic signature for dandruff.

FIG. 21A and FIG. 21B illustrate that a high fidelity in vitro model of dandruff may be constructed by treating a skin culture with a combination of IL17 and IL22.

DETAILED DESCRIPTION OF THE INVENTION

The present invention will now be described with occasional reference to the specific embodiments of the invention. This invention may, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and to fully convey the scope of the invention to those skilled in the art.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. The terminology used in the description of the invention herein is for describing particular embodiments only and is not intended to be limiting of the invention. As used in the description of the invention and the appended claims, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise.

As used interchangeably herein, the terms “connectivity map” and “C-map” refer broadly to devices, systems, articles of manufacture, and methodologies for identifying relationships between cellular phenotypes or cosmetic conditions, gene expression, and perturbagens, such as cosmetic actives.

As used herein, the term “cosmetic agent” means any substance, as well as any component thereof, intended to be rubbed, poured, sprinkled, sprayed, introduced into, or otherwise applied to a mammalian body or any part thereof for purposes of cleansing, beautifying, promoting attractiveness, altering the appearance, or combinations thereof. Cosmetic agents may include substances that are Generally Recognized as Safe (GRAS) by the US Food and Drug Administration, food additives, and materials used in non-cosmetic consumer products including over-the-counter medications. In some embodiments, cosmetic agents may be incorporated in a cosmetic composition comprising a dermatologically acceptable carrier suitable for topical application to skin. A cosmetic agent includes, but is not limited to, (i) chemicals, compounds, small or large molecules, extracts, formulations, or combinations thereof that are known to induce or cause at least one effect (positive or negative) on skin tissue; (ii) chemicals, compounds, small molecules, extracts, formulations, or combinations thereof that are known to induce or cause at least one effect (positive or negative) on skin tissue and are discovered, using the provided methods and systems, to induce or cause at least one previously unknown effect (positive or negative) on the skin tissue; and (iii) chemicals, compounds, small molecules, extracts, formulations, or combinations thereof that are not known to have an effect on skin tissue and are discovered, using the provided methods and systems, to induce or cause an effect on skin tissue.

Some examples of cosmetic agents or cosmetically actionable materials can be found in: the PubChem database associated with the National Institutes of Health, USA (http://pubchem.ncbi.nlm.nih.gov); the Ingredient Database of the Personal Care Products Council (http://online.personalcarecouncil.org/jsp/Home.jsp); and the 2010 International Cosmetic Ingredient Dictionary and Handbook, 13th Edition, published by The Personal Care Products Council; the EU Cosmetic Ingredients and Substances list; the Japan Cosmetic Ingredients List; the Personal Care Products Council, the SkinDeep database (URL: http://www.cosmeticsdatabase.com); the FDA Approved Excipients List; the FDA OTC List; the Japan Quasi Drug List; the US FDA Everything Added to Food database; EU Food Additive list; Japan Existing Food Additives, Flavor GRAS list; US FDA Select Committee on GRAS Substances; US Household Products Database; the Global New Products Database (GNPD) Personal Care, Health Care, Food/Drink/Pet and Household database (URL: http://www.gnpd.com); and from suppliers of cosmetic ingredients and botanicals.

Other non-limiting examples of cosmetic agents include botanicals (which may be derived from one or more of a root, stem, bark, leaf, seed or fruit of a plant). Some botanicals may be extracted from a plant biomass (e.g., root, stem, bark, leaf, etc.) using one or more solvents. Botanicals may comprise a complex mixture of compounds and lack a distinct active ingredient. Another category of cosmetic agents are vitamin compounds and derivatives and combinations thereof, such as a vitamin B3 compound, a vitamin B5 compound, a vitamin B6 compound, a vitamin B9 compound, a vitamin A compound, a vitamin C compound, a vitamin E compound, and derivatives and combinations thereof (e.g., retinol, retinyl esters, niacinamide, folic acid, panthenol, ascorbic acid, tocopherol, and tocopherol acetate). Other non-limiting examples of cosmetic agents include sugar amines, phytosterols, hexamidine, hydroxy acids, ceramides, amino acids, and polyols.

As used herein, the term “skin-active agent” is a subset of cosmetic agents as defined herein and includes generally any substance, as well as any component thereof, intended to be applied to the skin for the purpose of effectuating a treatment of an undesirable skin condition, for example, dandruff, seborrheic dermatitis, atopic dermatitis, rash, acne, or other condition that may be of substantially cosmetic concern. Categorical examples of skin-active agents include anti-dandruff actives, steroidal anti-inflammatory agents, non-steroidal anti-inflammatory agents, pediculocides, sensates, enzymes, vitamins, hair growth actives, sunscreens, and combinations thereof. Cosmetic compositions according to the instant invention may contain skin-active agents.

A specific category of skin-active agent is an anti-dandruff agent. Anti-dandruff agents known in the art include an antimicrobial anti-dandruff active, concentrations of which within the compositions range from about 0.001% to about 5%, more preferably from about 0.01% to about 3%, even more preferably from about 0.05% to about 1%, by weight of the composition. Specific examples of antimicrobial anti-dandruff actives include antifungal actives such as pyrithione salts, octopirox, ketoconazole, climbazole, ciclopirox, terbinafine, and sulfur or sulfur-containing actives such as selenium sulfide. A very specific example is zinc pyrithione (ZPT) at concentrations ranging from 0.005% to 2%, more preferably from about 0.005% to about 0.5%, by weight of the composition. Selenium sulfides are antimicrobial anti-dandruff actives well known in the personal care arts and are described, for example, in U.S. Pat. No. 2,694,668; U.S. Pat. No. 3,152,046; U.S. Pat. No. 4,089,945; and U.S. Pat. No. 4,885,107, which disclosures are incorporated in their entirety herein by this reference.

Pyrithione antimicrobial actives, especially 1-hydroxy-2-pyridinethione salts, are also well-known anti-dandruff actives for use in the scalp cosmetic compositions. Examples of pyrithione salts are those formed from heavy metals such as zinc, tin, cadmium, magnesium, aluminum and zirconium. Zinc salts are particularly favored anti-dandruff agents, especially the zinc salt of 1-hydroxy-2-pyridinethione (zinc pyrithione, ZPT). Other cations such as sodium may also be suitable. Pyrithione antimicrobial actives are well known in the hair care art and are described, for example, in U.S. Pat. No. 2,809,971; U.S. Pat. No. 3,236,733; U.S. Pat. No. 3,753,196; U.S. Pat. No. 3,761,418; U.S. Pat. No. 4,345,080; U.S. Pat. No. 4,323,683; U.S. Pat. No. 4,379,753; and U.S. Pat. No. 4,470,982, the disclosures of which are incorporated in their entirety herein by this reference. Other specific examples of zinc-containing skin-active agents which may be suitable as anti-dandruff agents include zinc pyrithione, zinc acetate, zinc acetylmethionate, zinc aspartate, zinc borate, zinc carbonate, zinc chloride, zinc citrate, zinc DNA, zinc formaldehyde sulfoxylate, zinc gluconate, zinc glutamate, zinc hydrolyzed collagen, zinc lactate, zinc laurate, zinc myristate, zinc neodecanoate, zinc palmitate, zinc PCA, zinc pentadecene tricarboxylate, zinc ricinoleate, zinc rosinate, zinc stearate, zinc sulfate, zinc undecylenate, zinc oxide, zinc lactobionate, and combinations thereof.

The terms “gene expression signature” and “gene-expression signature” refer to a rationally derived list, or plurality of lists, of genes representative of a skin tissue condition or a skin agent. In specific contexts, the skin agent may be a benchmark skin agent or a potential skin agent. Thus, the gene expression signature may serve as a proxy for a phenotype of interest for skin tissue. A gene expression signature may comprise genes whose expression, relative to a normal or control state, is increased (up-regulated), whose expression is decreased (down-regulated), and combinations thereof. Generally, a gene expression signature for a modified cellular phenotype may be described as a set of genes differentially expressed in the modified cellular phenotype over the cellular phenotype. A gene expression signature can be derived from various sources of data, including but not limited to, from in vitro testing, in vivo testing and combinations thereof. In some embodiments, a gene expression signature may comprise a first list representative of a plurality of up-regulated genes of the condition of interest and a second list representative of a plurality of down-regulated genes of the condition of interest.

As used herein, the term “benchmark skin agent” refers to any chemical, compound, small or large molecule, extract, formulation, or combinations thereof that is known to induce or cause a superior effect (positive or negative) on skin tissue. Non-limiting examples of benchmark skin-active agents well-known in the dandruff arts include zinc pyrithione (ZPT), selenium sulfide, ketoconazole, ciclopirox olamine and tar. Zinc pyrithione is commonly known as an antifungal and antibacterial agent and was first reported in the 1930s. Zinc pyrithione is best known for its use in the treatment of dandruff and seborrheic dermatitis. It also has antibacterial properties and is effective against many pathogens from the streptococcus and staphylococcus class. Its other medical applications include treatments of psoriasis, eczema, ringworm, fungus, athlete's foot, dry skin, atopic dermatitis, tinea, and vitiligo. Selenium sulfide is available as a 1% and 2.5% lotion and shampoo. In some countries, the higher strength preparations require a doctor's prescription. The shampoo is used to treat dandruff and seborrhea of the scalp, and the lotion is used to treat tinea versicolor, a fungal infection of the skin. Tar is a skin-active agent known to be effective as a therapeutic treatment to control scalp itching and flaking symptomatic of scalp psoriasis, eczema, seborrheic dermatitis and dandruff.

As used herein, the term “query” refers to data that is used as an input to a Connectivity Map and against which a plurality of instances are compared. A query may include a gene expression signature associated with a skin condition such as dandruff, or may include a gene expression signature derived from a physiological process signature determined for a skin condition. A C-map may be queried with perturbagens, gene expression signatures, skin disorders, thematic signatures, or any data feature or combination of data features or associations that comprise the data architecture.

The term “instance,” as used herein, refers to data from a gene expression profiling experiment in which skin cells are dosed with a perturbagen. In some embodiments, the data comprises a list of identifiers representing the genes that are part of the gene expression profiling experiment. The identifiers may include gene names, gene symbols, microarray probe set IDs, or any other identifier. In some embodiments, an instance may comprise data from a microarray experiment and comprises a list of probe set IDs of the microarray ordered by their extent of differential expression relative to a control. The data may also comprise metadata, including but not limited to data relating to one or more of the perturbagen, the gene expression profiling test conditions, the skin cells, and the microarray.

The term “keratinous tissue,” as used herein, refers to keratin-containing layers disposed as the outermost protective covering of mammals which includes, but is not limited to, skin, hair, nails, cuticles, horns, claws, beaks, and hooves. With respect to skin, the term refers to one or all of the dermal, hypodermal, and epidermal layers, which includes, in part, keratinous tissue.

As used herein, the term “dandruff” refers to a condition of the scalp marked by excessive flaking of scalp skin and typically accompanied by itching, regardless of etiology or pathogenic mechanism. Dandruff is distinguished from seborrheic dermatitis by the presence of affected skin outside the scalp in seborrheic dermatitis. The term “dandruff” may also refer to the flake itself.

The term “perturbagen,” as used herein, means anything used as a challenge in a gene expression profiling experiment to generate gene expression data for use in the present invention. In some embodiments, the perturbagen is applied to keratinocyte cells and the gene expression data derived from the gene expression profiling experiment may be stored as an instance in a data architecture. Any substance, chemical, compound, active, natural product, extract, drug [e.g., Sigma-Aldrich LOPAC (Library of Pharmacologically Active Compounds) collection], small molecule, and combinations thereof used to generate gene expression data can be a perturbagen. A perturbagen can also be any other stimulus used to generate differential gene expression data. For example, a perturbagen may also be UV radiation, heat, osmotic stress, pH, a microbe, a virus, a recombinant cytokine or growth factor, or small interfering RNA. A perturbagen may be, but is not required to be, any cosmetic agent.

The term “dermatologically acceptable,” as used herein, means that the compositions or components described are suitable for use in contact with human skin tissue without undue toxicity, incompatibility, instability, allergic response, and the like.

As used herein, the term “computer readable medium” refers to any electronic storage medium and includes but is not limited to any volatile, nonvolatile, removable, and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data and data structures, digital files, software programs and applications, or other digital information. Computer readable media include, but are not limited to, an application-specific integrated circuit (ASIC), a compact disk (CD), a digital versatile disk (DVD), a random access memory (RAM), a static RAM (SRAM), a dynamic RAM (DRAM), a synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), a direct RAM bus RAM (DRRAM), a read only memory (ROM), a programmable read only memory (PROM), an electrically erasable programmable read only memory (EEPROM), a disk, a carrier wave, and a memory stick. Examples of volatile memory include, but are not limited to, random access memory (RAM), static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), and direct RAM bus RAM (DRRAM). Examples of non-volatile memory include, but are not limited to, read only memory (ROM), programmable read only memory (PROM), erasable programmable read only memory (EPROM), and electrically erasable programmable read only memory (EEPROM). A memory can store processes and/or data. Still other computer readable media include any suitable disk media, including but not limited to, magnetic disk drives, floppy disk drives, tape drives, Zip drives, flash memory cards, memory sticks, compact disk ROM (CD-ROM), CD recordable drive (CD-R drive), CD rewriteable drive (CD-RW drive), and digital versatile ROM drive (DVD ROM).

As used herein, the terms “software” and “software application” refer to one or more computer readable and/or executable instructions that cause a computing device or other electronic device to perform functions, actions, and/or behave in a desired manner. The instructions may be embodied in one or more various forms like routines, algorithms, modules, libraries, methods, and/or programs. Software may be implemented in a variety of executable and/or loadable forms and can be located in one computer component and/or distributed between two or more communicating, co-operating, and/or parallel processing computer components and thus can be loaded and/or executed in serial, parallel, and other manners. Software can be stored on one or more computer readable media and may implement, in whole or part, the methods and functionalities of the present invention.

As used herein, the term “dandruff gene expression signature” refers to a gene expression signature derived from gene expression profiling of a dandruff condition.

As used herein, the term “connectivity score” refers to a derived value representing the degree to which an instance correlates to a query.

As used herein, the term “data architecture” refers generally to one or more digital data structures comprising an organized collection of data. In some embodiments, the digital data structures can be stored as a digital file (e.g., a spreadsheet file, a text file, a word processing file, a database file, etc.) on a computer readable medium. In some embodiments, the data architecture is provided in the form of a database that may be managed by a database management system (DBMS) that is used to access, organize, and select data (e.g., instances and gene expression signatures) stored in a database.

As used herein, the terms “gene expression profiling” and “gene expression profiling experiment” refer to the measurement of the expression of multiple genes in a biological sample using any suitable profiling technology. For example, the mRNA expression of thousands of genes may be determined using microarray techniques. Other emerging technologies that may be used include RNA-Seq or whole transcriptome sequencing using NextGen sequencing techniques.

As used herein, the term “microarray” refers broadly to any ordered array of nucleic acids, oligonucleotides, proteins, small molecules, large molecules, and/or combinations thereof on a substrate that enables gene expression profiling of a biological sample. Non-limiting examples of microarrays are available from Affymetrix, Inc.; Agilent Technologies, Inc.; Illumina, Inc.; GE Healthcare, Inc.; Applied Biosystems, Inc.; Beckman Coulter, Inc.; etc.

Unless otherwise indicated, all numbers expressing quantities of ingredients, properties such as molecular weight, reaction conditions, and so forth as used in the specification and claims are to be understood as being modified in all instances by the term “about”. Additionally, the disclosure of any ranges in the specification and claims are to be understood as including the range itself and also anything subsumed therein, as well as endpoints. All numeric ranges are inclusive of narrower ranges; delineated upper and lower range limits are interchangeable to create further ranges not explicitly delineated. Unless otherwise indicated, the numerical properties set forth in the specification and claims are approximations that may vary depending on the desired properties sought to be obtained in embodiments of the present invention. Notwithstanding that numerical ranges and parameters setting forth the broad scope of the invention are approximations, the numerical values set forth in the specific examples are reported as precisely as possible. Any numerical values, however, inherently contain certain errors necessarily resulting from error found in their respective measurements.

In accordance with one aspect of the present invention, provided are devices, systems and methods for implementing a connectivity map utilizing one or more query signatures associated with a dandruff or a dandruff-related condition. The query signatures may be derived in a variety of ways. In some embodiments, the query signatures may be gene expression signatures derived from gene expression profiling of full thickness skin biopsies of skin exhibiting a skin condition of interest compared to a control. The gene expression profiling can be carried out using any suitable technology, including but not limited to microarray analysis or NextGen sequencing. An example of a gene expression signature includes a specific dandruff gene expression signature, an example of which is described more fully hereafter. A query signature may be derived from transcriptional profiling of a keratinocyte cell line exposed to benchmark skin-active agents such as anti-dandruff agents. In other embodiments, the query signature may be a physiological theme expression signature derived from an analysis of statistically over-represented Gene Ontology processes and determining statistical clustering of the regulated genes as a function of the Gene Ontology. These query signatures may be used singularly or in combination.

In accordance with another aspect of the present invention, provided are devices, systems, and methods for implementing a connectivity map utilizing one or more instances derived from a perturbagen, such as a cosmetic agent, exposed to an epidermal keratinocyte cell line. Instances from more complex cell culture systems may also be used, such as skin organotypic cultures containing keratinocytes or ex vivo human skin. Instances from a plurality of cell lines may be used with the present invention.

In accordance with yet another aspect of the present invention, provided are devices, systems and methods for identification of relationships between a skin condition query signature (e.g., a dandruff condition query signature) and a plurality of instances, where the query signature may be a gene expression signature or a physiological theme expression signature. For example, it may be possible to ascertain perturbagens that give rise to a statistically significant activity on a statistically significant number of genes associated with a skin condition of interest, leading to the identification of new cosmetic agents for treating the skin condition or new uses of known cosmetic agents.

I. Systems and Devices

Referring to FIGS. 2, 4 and 5, some examples of systems and devices in accordance with the present invention for use in identifying relationships between perturbagens, skin tissue/dandruff conditions, and genes associated with the skin tissue/dandruff condition will now be described. System 10 comprises one or more of computing devices 12, 14, a computer readable medium 16 associated with the computing device 12, and communication network 18.

The computer readable medium 16, which may be provided as a hard disk drive, comprises a digital file 20, such as a database file, comprising a plurality of instances 22, 24, and 26 stored in a data structure associated with the digital file 20. The plurality of instances may be stored in relational tables and indexes or in other types of computer readable media. The instances 22, 24, and 26 may also be distributed across a plurality of digital files; a single digital file 20 is described herein, however, for simplicity.

The digital file 20 can be provided in a wide variety of formats, including but not limited to a word processing file format (e.g., Microsoft Word), a spreadsheet file format (e.g., Microsoft Excel), and a database file format. Some common examples of suitable file formats include, but are not limited to, those associated with file extensions such as *.xls, *.xld, *.xlk, *.xll, *.xlt, *.xlsx, *.dif, *.db, *.dbf, *.accdb, *.mdb, *.mdf, *.cdb, *.fdb, *.csv, *.sql, *.xml, *.doc, *.txt, *.rtf, *.log, *.docx, *.ans, *.pages, *.wps, etc.

Referring to FIG. 3, in some embodiments the instance 22 may comprise an ordered listing of microarray probe set IDs, wherein the value of N is equal to the total number of probes on the microarray used in analysis. Common microarrays include Affymetrix GeneChips and Illumina BeadChips, both of which comprise probe sets and custom probe sets. To generate the reference gene profiles according to the invention, preferred chips are those designed for profiling the human genome. Examples of Affymetrix chips with utility in the instant invention include model Human Genome (HG)-U133 Plus 2.0. A specific Affymetrix chip employed by the instant investigators is HG-U133A2.0; however, it will be understood by a person of ordinary skill in the art that any chip or microarray, regardless of proprietary origin, is suitable so long as the probe sets of the chips used to construct a data architecture according to the invention are substantially similar.

Instances derived from microarray analyses utilizing Affymetrix GeneChips may comprise an ordered listing of gene probe set IDs where the list comprises 22,000+ IDs. The ordered listing may be stored in a data structure of the digital file 20 and the data arranged so that, when the digital file is read by the software application 28, a plurality of character strings are reproduced representing the ordered listing of probe set IDs. While it is preferred that each instance comprise a full list of the probe set IDs, it is contemplated that one or more of the instances may comprise less than all of the probe set IDs of a microarray. It is also contemplated that the instances may include other data in addition to or in place of the ordered listing of probe set IDs. For example, an ordered listing of equivalent gene names and/or gene symbols may be substituted for the ordered listing of probe set IDs. Additional data may be stored with an instance and/or the digital file 20. In some embodiments, the additional data is referred to as metadata and can include one or more of cell line identification, batch number, exposure duration, and other empirical data, as well as any other descriptive material associated with an instance ID. The ordered list may also comprise a numeric value associated with each identifier that represents the ranked position of that identifier in the ordered list.
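By way of non-limiting illustration, an instance of the kind described above might be represented as in the following minimal Python sketch. The names `Instance` and `build_instance`, the probe set identifiers, and the use of a simple treated-minus-control difference as the measure of differential expression are assumptions made for illustration only and do not appear in the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Instance:
    """Hypothetical container for one instance: an ordered listing of
    probe set IDs plus optional metadata (cell line, batch, etc.)."""
    instance_id: str
    ordered_probe_ids: List[str]
    metadata: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Numeric value for each identifier: its 1-based ranked position.
        self._ranks = {pid: i + 1 for i, pid in enumerate(self.ordered_probe_ids)}

    def rank_of(self, probe_id: str) -> int:
        """Return the ranked position of a probe set ID in the ordered list."""
        return self._ranks[probe_id]


def build_instance(instance_id: str,
                   treated: Dict[str, float],
                   control: Dict[str, float],
                   metadata: Dict[str, str] = None) -> Instance:
    """Order probe set IDs from most up-regulated to most down-regulated.

    Assumption: expression values are on a log scale, so the difference
    treated - control stands in for the extent of differential expression.
    """
    diffs = {pid: treated[pid] - control[pid] for pid in treated}
    ordered = sorted(diffs, key=lambda pid: diffs[pid], reverse=True)
    return Instance(instance_id, ordered, dict(metadata or {}))
```

In such a sketch, the ranked position returned by `rank_of` corresponds to the numeric value associated with each identifier in the ordered list; other measures of differential expression could be substituted in `build_instance` without changing the structure.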

Referring again to FIGS. 2, 3 and 4, the computer readable medium 16 may also have a second digital file 30 stored thereon. The second digital file 30 comprises one or more lists 32 of microarray probe set IDs associated with one or more dandruff gene expression signatures. The listing 32 of microarray probe set IDs typically comprises a much smaller list of probe set IDs than the instances of the first digital file 20. In some embodiments, the list comprises between 2 and 1000 probe set IDs. In other embodiments the list comprises greater than 10, 50, 100, 200, or 300 and/or less than about 800, 600, or about 400 probe set IDs. The listing 32 of probe set IDs of the second digital file 30 comprises a list of probe set IDs representing up- and/or down-regulated genes selected to represent a skin condition of interest. In some embodiments, a first list may represent the up-regulated genes and a second list may represent the down-regulated genes of the gene expression signature. The listing(s) may be stored in a data structure of the digital file 30 and the data arranged so that, when the digital file is read by the software application 28, a plurality of character strings are reproduced representing the list of probe set IDs. Instead of probe set IDs, equivalent gene names and/or gene symbols (or another nomenclature) may be substituted for a list of probe set IDs. Additional data may be stored with the gene expression signature and/or the digital file 30 and this is commonly referred to as metadata, which may include any associated information, for example, cell line or sample source, and microarray identification. Examples of listings of probe set IDs for a dandruff gene expression signature are set forth in Tables A (up-regulated and down-regulated genes clustered in a physiological theme signature/pattern) and B (the 70 most up-regulated and 70 most down-regulated genes in a dandruff gene expression signature). In some embodiments, one or more skin condition/dandruff gene expression signatures may be stored in a plurality of digital files and/or stored on a plurality of computer readable media. In other embodiments, a plurality of gene expression signatures (e.g., 32, 34) may be stored in the same digital file (e.g., 30) or stored in the same digital file or database that comprises the instances 22, 24, and 26.
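As a non-limiting illustration of the two-list arrangement described above, the following Python sketch represents a gene expression signature as separate up-regulated and down-regulated lists and locates those probe set IDs within the ordered listing of an instance. The names `GeneExpressionSignature`, `validate_signature`, and `signature_positions` are hypothetical, the size limits follow the 2 to 1000 range of one embodiment, and the check that no probe set ID appears in both lists is an added assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class GeneExpressionSignature:
    """Hypothetical two-list signature: up- and down-regulated probe set IDs."""
    name: str
    up_probe_ids: List[str]
    down_probe_ids: List[str]

    def size(self) -> int:
        return len(self.up_probe_ids) + len(self.down_probe_ids)


def validate_signature(sig: GeneExpressionSignature,
                       min_size: int = 2, max_size: int = 1000) -> None:
    """Raise ValueError if the signature falls outside the assumed limits."""
    # Assumption: a probe set ID should not be both up- and down-regulated.
    overlap = set(sig.up_probe_ids) & set(sig.down_probe_ids)
    if overlap:
        raise ValueError(f"probe set IDs in both lists: {sorted(overlap)}")
    if not min_size <= sig.size() <= max_size:
        raise ValueError(f"signature size {sig.size()} outside "
                         f"[{min_size}, {max_size}]")


def signature_positions(sig: GeneExpressionSignature,
                        ordered_probe_ids: Sequence[str]) -> Dict[str, List[int]]:
    """Return the 1-based ranked positions of the signature's probe set IDs
    within an instance's ordered listing, skipping IDs the instance lacks."""
    rank = {pid: i + 1 for i, pid in enumerate(ordered_probe_ids)}
    return {
        "up": [rank[p] for p in sig.up_probe_ids if p in rank],
        "down": [rank[p] for p in sig.down_probe_ids if p in rank],
    }
```

Skipping identifiers absent from an instance reflects the possibility, noted above, that an instance may comprise less than all of the probe set IDs of a microarray.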

As previously described, the data stored in the first and second digital files may be stored in a wide variety of data structures and/or formats. In some embodiments, the data is stored in one or more searchable databases, such as free databases, commercial databases, or a company's internal proprietary database. The database may be provided or structured according to any model known in the art, such as for example and without limitation, a flat model, a hierarchical model, a network model, a relational model, a dimensional model, or an object-oriented model. In some embodiments, at least one searchable database is a company's internal proprietary database. A user of the system 10 may use a graphical user interface associated with a database management system to access and retrieve data from the one or more databases or other data sources to which the system is operably connected. In some embodiments, the first digital file 20 is provided in the form of a first database and the second digital file 30 is provided in the form of a second database. In other embodiments, the first and second digital files may be combined and provided in the form of a single file.
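For illustration only, one way the relational model mentioned above might be realized is sketched below using Python's standard `sqlite3` module, with instances and gene expression signatures combined in a single database. The table names, column names, and helper functions are hypothetical and are not taken from the specification.

```python
import sqlite3
from typing import List, Sequence, Tuple

SCHEMA = """
CREATE TABLE instance (
    instance_id TEXT PRIMARY KEY,
    cell_line   TEXT,
    perturbagen TEXT
);
CREATE TABLE instance_rank (
    instance_id   TEXT NOT NULL REFERENCES instance(instance_id),
    probe_id      TEXT NOT NULL,
    rank_position INTEGER NOT NULL,
    PRIMARY KEY (instance_id, probe_id)
);
CREATE TABLE signature_probe (
    signature_id TEXT NOT NULL,
    probe_id     TEXT NOT NULL,
    direction    TEXT NOT NULL CHECK (direction IN ('up', 'down')),
    PRIMARY KEY (signature_id, probe_id)
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and create the hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def store_instance(conn: sqlite3.Connection, instance_id: str,
                   ordered_probe_ids: Sequence[str],
                   cell_line: str, perturbagen: str) -> None:
    """Store an instance and the ranked position of each probe set ID."""
    conn.execute("INSERT INTO instance VALUES (?, ?, ?)",
                 (instance_id, cell_line, perturbagen))
    conn.executemany(
        "INSERT INTO instance_rank VALUES (?, ?, ?)",
        [(instance_id, pid, i + 1) for i, pid in enumerate(ordered_probe_ids)])


def store_signature(conn: sqlite3.Connection, signature_id: str,
                    up: Sequence[str], down: Sequence[str]) -> None:
    """Store the up- and down-regulated lists of a signature."""
    rows = [(signature_id, p, "up") for p in up]
    rows += [(signature_id, p, "down") for p in down]
    conn.executemany("INSERT INTO signature_probe VALUES (?, ?, ?)", rows)


def signature_ranks(conn: sqlite3.Connection, signature_id: str,
                    instance_id: str) -> List[Tuple[str, str, int]]:
    """Select where each signature probe set ID falls in one instance."""
    return conn.execute(
        """SELECT s.probe_id, s.direction, r.rank_position
           FROM signature_probe AS s
           JOIN instance_rank AS r ON r.probe_id = s.probe_id
           WHERE s.signature_id = ? AND r.instance_id = ?
           ORDER BY r.rank_position""",
        (signature_id, instance_id)).fetchall()
```

Keeping the ranked positions in their own table allows a database management system to select, for any stored signature, the positions of its probe set IDs in any stored instance with a single join; a flat-file or object-oriented arrangement could serve equally well.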

In some embodiments, the first digital file 20 may include data that is transmitted across the communication network 18 from a digital file 36 stored on the computer readable medium 38. In one embodiment, the first digital file 20 may comprise gene expression data obtained from a cell line (e.g., a fibroblast cell line and/or a keratinocyte cell line) as well as data from the digital file 36, such as gene expression data from other cell lines or cell types, gene expression signatures, perturbagen information, clinical trial data, scientific literature, chemical databases, pharmaceutical databases, and other such data and metadata. The digital file 36 may be provided in the form of a database, including but not limited to Sigma-Aldrich LOPAC collection, Broad Institute C-MAP collection, GEO collection, and Chemical Abstracts Service (CAS) databases.

The computer readable medium 16 (or another computer readable medium, such as 16) may also have stored thereon one or more digital files 28 comprising computer readable instructions or software for reading, writing to, or otherwise managing and/or accessing the digital files 20, 30. The computer readable medium 16 may also comprise software or computer readable and/or executable instructions that cause the computing device 12 to perform one or more steps of the methods of the present invention, including for example and without limitation, the step(s) associated with comparing a gene expression signature stored in digital file 30 to instances 22, 24, and 26 stored in digital file 20. In some embodiments, the one or more digital files 28 may form part of a database management system for managing the digital files 20, 30. Non-limiting examples of database management systems are described in U.S. Pat. Nos. 4,967,341 and 5,297,279.

The computer readable medium 16 may form part of or otherwise be connected to the computing device 12. The computing device 12 can be provided in a wide variety of forms, including but not limited to any general or special purpose computer such as a server, a desktop computer, a laptop computer, a tower computer, a microcomputer, a minicomputer, and a mainframe computer. While various computing devices may be suitable for use with the present invention, a generic computing device 12 is illustrated in FIG. 4. The computing device 12 may comprise one or more components selected from a processor 40, system memory 42, and a system bus 44. The system bus 44 provides an interface for system components including but not limited to the system memory 42 and processor 40. The system bus 44 can be any of several types of bus structures that may further interconnect to a memory bus (with or without a memory controller), a peripheral bus, and a local bus using any of a variety of commercially available bus architectures. Examples of a local bus include an industry standard architecture (ISA) bus, a microchannel architecture (MCA) bus, an extended ISA (EISA) bus, a peripheral component interconnect (PCI) bus, a universal serial bus (USB), and a small computer systems interface (SCSI) bus. The processor 40 may be selected from any suitable processor, including but not limited to, dual microprocessor and other multi-processor architectures. The processor executes a set of stored instructions associated with one or more program applications or software.

The system memory 42 can include non-volatile memory 46 (e.g., read only memory (ROM), erasable programmable read only memory (EPROM), electrically erasable programmable read only memory (EEPROM), etc.) and/or volatile memory 48 (e.g., random access memory (RAM)). A basic input/output system (BIOS) can be stored in the non-volatile memory 46, and can include the basic routines that help to transfer information between elements within the computing device 12. The volatile memory 48 can also include a high-speed RAM such as static RAM for caching data.

The computing device 12 may further include a storage 44, which may comprise, for example, an internal hard disk drive [HDD, e.g., enhanced integrated drive electronics (EIDE) or serial advanced technology attachment (SATA)] for storage. The computing device 12 may further include an optical disk drive 46 (e.g., for reading a CD-ROM or DVD-ROM 48). The drives and associated computer-readable media provide non-volatile storage of data, data structures and the data architecture of the present invention, computer-executable instructions, and so forth. For the computing device 12, the drives and media accommodate the storage of any data in a suitable digital format. Although the description of computer-readable media above refers to an HDD and optical media such as a CD-ROM or DVD-ROM, it should be appreciated by those skilled in the art that other types of media which are readable by a computer, such as Zip disks, magnetic cassettes, flash memory cards, cartridges, and the like may also be used, and further, that any such media may contain computer-executable instructions for performing the methods of the present invention.

A number of software applications can be stored on the drives 44 and volatile memory 48, including an operating system and one or more software applications, which implement, in whole or part, the functionality and/or methods described herein. It is to be appreciated that the embodiments can be implemented with various commercially available operating systems or combinations of operating systems. The central processing unit 40, in conjunction with the software applications in the volatile memory 48, may serve as a control system for the computing device 12 that is configured to, or adapted to, implement the functionality described herein.

A user may be able to enter commands and information into the computing device 12 through one or more wired or wireless input devices 50, for example, a keyboard, a pointing device, such as a mouse (not illustrated), or a touch screen. These and other input devices are often connected to the central processing unit 40 through an input device interface 52 that is coupled to the system bus 44 but can be connected by other interfaces, such as a parallel port, an IEEE 1394 serial port, a game port, a universal serial bus (USB) port, an IR interface, etc. The computing device 12 may drive a separate or integral display device 54, which may also be connected to the system bus 44 via an interface, such as a video port 56.

The computing devices 12, 14 may operate in a networked environment across network 18 using a wired and/or wireless network communications interface 58. The network interface port 58 can facilitate wired and/or wireless communications. The network interface port can be part of a network interface card, network interface controller (NIC), network adapter, or LAN adapter. The communication network 18 can be a wide area network (WAN) such as the Internet, or a local area network (LAN). The communication network 18 can comprise a fiber optic network, a twisted-pair network, a T1/E1 line-based network or other links of the T-carrier/E-carrier protocol, or a wireless local area or wide area network (operating through multiple protocols such as ultra mobile broadband (UMB), long term evolution (LTE), etc.). Additionally, communication network 18 can comprise base stations for wireless communications, which include transceivers, associated electronic devices for modulation/demodulation, and switches and ports to connect to a backbone network for backhaul communication such as in the case of packet-switched communications.

II. Methods for Creating a Plurality of Instances

In some embodiments, the methods of the present invention may comprise populating at least the first digital file 20 with a plurality of instances (e.g., 22, 24, 26) comprising data derived from a plurality of gene expression profiling experiments, wherein one or more of the experiments comprise exposing, for example, keratinocyte cells (or other skin cells such as human skin equivalent cultures or ex vivo cultured human skin) to at least one perturbagen. For simplicity of discussion, the gene expression profiling discussed hereafter will be in the context of a microarray experiment.

Referring to FIG. 5, one embodiment of a method of the present invention is illustrated. The method 58 comprises exposing a keratinocyte cell to a perturbagen 64. The perturbagen may be dissolved in a carrier, such as dimethyl sulfoxide (DMSO). After exposure, mRNA is extracted from the cells exposed to the perturbagen and from reference cells 66 (e.g., keratinocyte cells) which are exposed to only the carrier. The mRNA 68, 70, 72 may be reverse transcribed to cDNA 74, 76, 78 and marked with different fluorescent dyes (e.g., red and green) if a two color microarray analysis is to be performed. Alternatively, the samples may be prepped for a one color microarray analysis, and further a plurality of replicates may be processed if desired. The cDNA samples may be co-hybridized to the microarray 80 comprising a plurality of probes 82. The microarray may comprise thousands of probes 82. In some embodiments, there are between 10,000 and 50,000 gene probes 82 present on the microarray 80. The microarray is scanned by a scanner 84, which excites the dyes and measures the amount of fluorescence. A computing device 86 may be used to analyze the raw images to determine the expression levels of a gene in the cells 60, 62 relative to the reference cells 66. The scanner 84 may incorporate the functionality of the computing device 86. The expression levels include: i) up-regulation [e.g., greater binding of the test material (e.g., cDNA 74, 76) to the probe than the reference material (e.g., cDNA 78)], ii) down-regulation [e.g., greater binding of the reference material (e.g., cDNA 78) to the probe than the test material (e.g., cDNA 74, 76)], iii) expressed but not differentially [e.g., similar binding of the reference material (e.g., cDNA 78) and the test material (e.g., cDNA 74, 76) to the probe], and iv) no detectable signal or noise. The up- and down-regulated genes are referred to as differentially expressed.

Microarrays and microarray analysis techniques are well known in the art, and it is contemplated that other microarray techniques may be used with the methods, devices and systems of the present invention. For example, any suitable commercial or non-commercial microarray technology and associated techniques may be used. Good results have been obtained with Affymetrix GeneChip® technology and Illumina BeadChip™ technology. One illustrative technique is described in the Examples, “Generally Applicable” methods section. However, one of skill in the art will appreciate that the present invention is not limited to the methodology of the example and that other methods and techniques are also contemplated to be within its scope.

In a very specific embodiment, an instance consists of the rank ordered data for all of the probe sets on the Affymetrix HG-U133A 2.0 GeneChip, wherein each probe on the chip has a unique probe set identifier (ID). The probe sets are rank ordered by the fold change relative to the controls in the same C-map batch (single instance/average of controls). The probe set identifiers are rank-ordered to reflect the most up-regulated to the most down-regulated.

Notably, even for the non-differentially regulated genes, the signal values for a particular probe set are unlikely to be identical for the instance and control, so a fold change different from 1 will be calculated that can be used for comprehensive rank ordering. In accordance with methods disclosed by Lamb et al. (2006), data are adjusted using 2 thresholds to minimize the effects of genes that may have very low, noisy signal values, which can lead to spurious large fold changes. The thresholding is preferably done before the rank ordering. An example for illustrative purposes includes a process wherein a first threshold is set at 20. If the signal for a probe set is below 20, it is adjusted to 20. Ties for ranking are broken with a second threshold wherein the fold changes are recalculated and any values less than 2 are set to 2. For any remaining ties the order depends on the specific sorting algorithm used but is essentially random. The probe sets in the middle of the list do not meaningfully contribute to an actual connectivity score.
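
By way of illustration only, the thresholding and rank ordering described above may be sketched as follows. The function names, the dictionary inputs, and the interpretation of the second threshold (signals floored at 2 rather than 20, with the recalculated fold change used only to break ties) are assumptions made for this sketch and are not a definitive statement of the procedure of Lamb et al.

```python
from typing import Dict, List


def fold_change(treated: float, control: float, floor: float) -> float:
    """Fold change after raising both signals to at least `floor`."""
    return max(treated, floor) / max(control, floor)


def rank_probe_sets(treated: Dict[str, float],
                    control_mean: Dict[str, float],
                    primary: float = 20.0,
                    secondary: float = 2.0) -> List[str]:
    """Order probe set IDs from most up-regulated to most down-regulated.

    The primary key is the fold change with signals floored at `primary`.
    Ties are broken by the fold change recalculated with the lower
    `secondary` floor (an assumed reading of the second threshold).
    """
    def sort_key(probe: str):
        t, c = treated[probe], control_mean[probe]
        return (fold_change(t, c, primary), fold_change(t, c, secondary))

    # sorted() is stable, so any remaining ties keep their input order.
    return sorted(treated, key=sort_key, reverse=True)


# Hypothetical signal values for five probe sets.
treated = {"A": 200.0, "B": 10.0, "C": 5.0, "D": 15.0, "E": 100.0}
control = {"A": 100.0, "B": 12.0, "C": 15.0, "D": 5.0, "E": 400.0}
ranked = rank_probe_sets(treated, control)
```

Probe sets B, C and D all fall below the first threshold and therefore tie at a fold change of 1; in this sketch only the second threshold separates them.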

The rank ordered data are stored as an instance. The probes may be sorted into a list according to the level of gene expression regulation detected, wherein the list progresses from up-regulated to marginal or no regulation to down-regulated, and this rank ordered listing of probe IDs is stored as an instance (e.g., 22) in the first digital file 20. Referring to FIG. 3, the data associated with an instance comprises the probe ID 80 and a value 82 representing its ranking in the list (e.g., 1, 2, 3, 4 . . . N, where N represents the total number of probes on the microarray). The ordered list 84 may generally comprise approximately three groupings of probe IDs: a first grouping 86 of probe IDs associated with up-regulated genes, a second group 88 of probe IDs associated with genes with marginal regulation or no detectable signal or noise, and a third group 90 of probe IDs associated with down-regulated genes. The most up-regulated genes are at or near the top of the list 84 and the most down-regulated genes are at or near the bottom of the list 84. The groupings are shown for illustration, but the lists for each instance may be continuous and the number of regulated genes will depend on the strength of the effect of the perturbagen associated with the instance. Other arrangements within the list 84 may be provided. For example, the probe IDs associated with the down-regulated genes may be arranged at the top of the list 84. This instance data may also further comprise metadata such as perturbagen identification, perturbagen concentration, cell line or sample source, and microarray identification.
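
As a non-limiting sketch, an instance of the kind described above might be held in memory as a record pairing the ordered probe ID list with its metadata. The class name, field names and example values below are hypothetical and are chosen only to mirror the probe ID, rank value and metadata described in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Instance:
    """One rank ordered gene expression profile plus descriptive metadata."""
    instance_id: str
    # Probe IDs ordered from most up-regulated to most down-regulated.
    ranked_probe_ids: List[str]
    # E.g., perturbagen identification, concentration, cell line, microarray.
    metadata: Dict[str, str] = field(default_factory=dict)

    def ranks(self) -> Dict[str, int]:
        """Map each probe ID to its rank (1 = most up-regulated, N = last)."""
        return {probe: rank
                for rank, probe in enumerate(self.ranked_probe_ids, start=1)}


# Hypothetical example with three probe IDs.
example = Instance(
    instance_id="instance-0001",
    ranked_probe_ids=["probe_c", "probe_a", "probe_b"],
    metadata={"perturbagen": "hypothetical compound",
              "concentration": "10 uM",
              "cell_line": "keratinocyte"},
)
```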

In some embodiments, one or more instances comprise at least about 1,000, 2,500, 5,000, 10,000, or 20,000 identifiers and/or less than about 30,000, 25,000, or 20,000 identifiers. In some embodiments, the database comprises at least about 50, 100, 250, 500, or 1,000 instances and/or less than about 50,000, 20,000, 15,000, 10,000, 7,500, 5,000, or 2,500 instances. Replicates of an instance may be created, and the same perturbagen may be used to derive a first instance from keratinocyte cells and a second instance from another skin cell type, such as fibroblasts, melanocytes or complex tissue, for example ex vivo human skin.

The present inventors have surprisingly discovered that instances derived from keratinocyte cells appear to be more predictive than instances derived from other cell types when used in combination with a dandruff condition expression signature. As described more fully hereafter in Example 3, the present inventors compared instances derived from BJ fibroblast cells and keratinocyte cells with a dandruff gene expression signature and found that instances derived from the keratinocyte cells were dramatically overrepresented in the highest ranking results (the higher the ranking, the more likely the perturbagen is to have a beneficial effect upon the dandruff condition) compared to fibroblast cells.

III. Methods for Deriving Dandruff Gene Expression Signatures

Some methods of the present invention comprise identifying a gene expression signature that represents the up-regulated and down-regulated genes associated with a skin condition of interest, in particular with dandruff. The pathogenesis of dandruff typically involves complex processes involving numerous known and unknown extrinsic and intrinsic factors, as well as responses to such factors that are subtle over a relatively short period of time but non-subtle over a longer period of time. This is in contrast to what is typically observed in drug development and drug screening methods, wherein a specific target, gene, or mechanism of action is of interest. Due to the unique screening challenges associated with the dandruff condition, the quality of the gene expression signature representing the condition of interest can be important for distinguishing the gene expression data actually associated with a response to a perturbagen from the background expression data.

One challenge in developing gene expression signatures for dandruff and dandruff-related skin disorders is that the number of genes selected needs to be adequate to reflect the dominant and key biology but not so large as to include many genes that have achieved a level of statistical significance by random chance and are non-informative. Thus, query signatures should be carefully derived since the predictive value may be dependent upon the quality of the gene expression signature.

One factor that can impact the quality of the query signature is the number of genes included in the signature. The present inventors have found that, with respect to a cosmetic data architecture and connectivity map, too few genes can result in a signature that is unstable with regard to the highest scoring instances. In other words, small changes to the gene expression signature can result in significant differences in the highest scoring instance. Conversely, too many genes may tend to partially mask the dominant biological responses and will include a higher fraction of genes meeting statistical cutoffs by random chance, thereby adding undesirable noise to the signature. The inventors have found that the number of genes desirable in a gene expression signature is also a function of the strength of the biological response associated with the condition and the number of genes needed to meet minimal values (e.g., a p-value less than about 0.05) for statistical significance. Hence, what is considered an ideal number of genes will vary from condition to condition. When the biology is weaker, as is typically the case with cosmetic condition phenotypes, fewer genes than those which may meet the statistical requisite for inclusion in the prior art may be used to avoid adding noisy genes.

For example, the present inventors have determined that where gene expression profiling analysis of a skin condition yields between about 2,000 and 4,000 genes having a statistical p-value of less than 0.05 and approximately 1,000 genes having a p-value of less than 0.001, a very strong biological response is indicated. A moderately strong biological response may yield approximately 800-2,000 genes having a statistical p-value of less than 0.05 combined with approximately 400-600 genes having a p-value of less than 0.001. In these cases, a gene expression signature comprising between about 100 and about 600 genes appears ideal. Weaker biology may be better represented by a gene expression signature comprising fewer genes, such as between about 20 and 100 genes.
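
The guidance above may be illustrated with a short sketch that maps gene counts at the two p-value levels to a suggested signature size range. The hard cutoffs used here are an assumed simplification of the approximate ranges given in the text, and the function name is hypothetical; the sketch is offered for illustration and is not a validated rule.

```python
from typing import Tuple


def suggest_signature_size(n_p05: int, n_p001: int) -> Tuple[str, Tuple[int, int]]:
    """Classify response strength and suggest a signature size range.

    n_p05:  number of genes with p-value < 0.05
    n_p001: number of genes with p-value < 0.001
    The cutoffs are illustrative lower bounds of the approximate ranges
    in the text (assumption).
    """
    if n_p05 >= 2000 and n_p001 >= 1000:
        return "very strong", (100, 600)
    if n_p05 >= 800 and n_p001 >= 400:
        return "moderately strong", (100, 600)
    return "weaker", (20, 100)


# Hypothetical gene counts from three profiling analyses.
strong = suggest_signature_size(3000, 1000)
moderate = suggest_signature_size(1200, 500)
weak = suggest_signature_size(300, 40)
```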

While a gene expression signature may represent all significantly regulated genes associated with a skin condition of interest, typically it represents a subset of such genes. The present inventors have discovered that dandruff-related gene expression signatures comprising between about 100 and about 400 genes of approximately equal numbers of up-regulated and/or down-regulated genes are stable, reliable, and can provide predictive results. For example, a suitable gene expression signature may have from about 100-150 genes, 250-300 genes, 300-350 genes, or 350-400 genes. In a very specific embodiment, an unhealthy skin gene expression signature includes the 70 most up- and down-regulated genes. However, one of skill in the art will appreciate that gene expression signatures comprising fewer or more genes are also within the scope of the various embodiments of the invention. For purposes of depicting a gene expression signature, the probe set IDs associated with the genes are preferably separated into a first list comprising the most up-regulated genes and a second list comprising the most down-regulated genes, as set forth in FIG. 1A and FIG. 1B, Table B.
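
By way of example only, separating a signature into an up list and a down list may be sketched as follows. The input dictionaries, the p-value cutoff and the function name are assumptions for this sketch; the fold changes are taken as condition relative to control, so that values above 1 denote up-regulation.

```python
from typing import Dict, List, Tuple


def build_signature(fold: Dict[str, float],
                    p_value: Dict[str, float],
                    n_per_list: int,
                    alpha: float = 0.05) -> Tuple[List[str], List[str]]:
    """Return (up_list, down_list) of the most strongly regulated probe IDs.

    Only probes with p-value below `alpha` are considered. The up list holds
    the `n_per_list` largest fold changes above 1, and the down list holds the
    `n_per_list` smallest fold changes below 1.
    """
    significant = [p for p in fold if p_value[p] < alpha]
    up = sorted((p for p in significant if fold[p] > 1.0),
                key=lambda p: fold[p], reverse=True)[:n_per_list]
    down = sorted((p for p in significant if fold[p] < 1.0),
                  key=lambda p: fold[p])[:n_per_list]
    return up, down


# Hypothetical fold changes and p-values for six probe IDs.
fold = {"a": 4.0, "b": 2.0, "c": 1.5, "d": 0.5, "e": 0.25, "f": 3.0}
pval = {"a": 0.001, "b": 0.01, "c": 0.2, "d": 0.03, "e": 0.001, "f": 0.04}
up_list, down_list = build_signature(fold, pval, n_per_list=2)
```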

Gene expression signatures may be generated from full thickness skin biopsies from skin having the skin condition of interest compared to a control. For generation of dandruff gene expression signatures, biopsies are taken from dandruff-affected scalp skin and compared to non-dandruff-affected scalp skin sampled from an anatomically comparable site in an unaffected subject. The present investigators determined that with respect to a subject suffering from any dandruff, even scalp skin that is free of dandruff lesions has a perturbed thematic profile.

In other embodiments of the present invention, a gene expression signature may be derived from a gene expression profiling analysis of keratinocyte cells treated with a benchmark skin-active agent, in particular an anti-dandruff agent, to represent cellular perturbations leading to improvement in the skin tissue condition treated with that benchmark skin agent, wherein the signature comprises a plurality of genes up-regulated and down-regulated by the benchmark skin agent in cells in vitro. As one illustrative example, microarray gene expression profile data where the perturbagen is the known anti-dandruff agent ZPT may be analyzed using the present invention to determine a subset of the most highly significantly regulated genes. Thus, a list of genes strongly up-regulated and strongly down-regulated in response to challenge with ZPT can be derived, and the list of genes (a proxy for the dandruff condition) can be used as a query signature to screen for anti-dandruff agents. In another embodiment, a signature may be derived to represent more than one aspect of the condition of interest.

In some embodiments, a gene expression signature may be mapped onto a biological process grid or Gene Ontology, to yield a physiological theme pattern. The broadest pattern would include all themes where genes are statistically clustered. A more circumscribed pattern might include a subset of themes populated with the strongest-regulated genes, or a subset that is unique with respect to related disorders and therefore may provide a tool for differential diagnosis, or a tool for screening for actives having very precise and targeted effects. It will be clear that gene signatures derived from Gene Ontology and thematic pattern analysis will generally include fewer genes. An exemplary gene signature based on the lipid-immune/inflammation theme discovered by the present investigators as particularly relevant for dandruff is set forth in FIG. 1A and FIG. 1B, Table A.

IV. Methods for Comparing a Plurality of Instances to One or More Dandruff Gene Expression Signatures

Referring to FIG. 6 and FIG. 7, a method for querying a plurality of instances with one or more dandruff gene signatures will now be described. Broadly, the method comprises querying a plurality of instances with one or more dandruff gene signatures and applying a statistical method to determine how strongly the signature genes match the regulated genes in an instance. Positive connectivity occurs when the genes in the up-regulated signature list are enriched among the up-regulated genes in an instance and the genes in the down-regulated signature list are enriched among the down-regulated genes in an instance. On the other hand, if the up-regulated genes of the signature are predominantly found among the down-regulated genes of the instance, and vice versa, this is scored as negative connectivity. FIG. 6 schematically illustrates an extreme example of a positive connectivity between signature 90 and the instance 104 comprising the probe IDs 102, wherein the probe IDs of the instance are ordered from most up-regulated to most down-regulated. In this example, the probe IDs 100 (e.g., X₁, X₂, X₃, X₄, X₅, X₆, X₇, X₈) of the gene signature 90, comprising an up list 97 and a down list 99, have a one to one positive correspondence with the most up-regulated and down-regulated probe IDs 102 of the instance 104, respectively. Similarly, FIG. 7 schematically illustrates an extreme example of a negative connectivity between signature 94 and the instance 88 comprising the probe IDs 90, wherein the probe IDs of the instance are ordered from most up-regulated to most down-regulated. In this example, the probe IDs of the up list 93 (e.g., X₁, X₂, X₃, X₄) correspond exactly with the most down-regulated genes of the instance 88, and the probe IDs of the down list 95 (e.g., X₅, X₆, X₇, X₈) correspond exactly to the most up-regulated probe IDs of the instance 88. FIG. 8 schematically illustrates an extreme example of neutral connectivity, wherein there is no consistent enrichment of the up- and down-regulated genes of the signature among the up- and down-regulated genes of the instance, either positive or negative. Hence the probe IDs 106 (e.g., X₁, X₂, X₃, X₄, X₅, X₆, X₇, X₈) of a gene signature 108 (comprising an up list 107 and a down list 109) are scattered with respect to rank with the probe IDs 110 of the instance 112, wherein the probe IDs of the instance are ordered from most up-regulated to most down-regulated. While the above embodiments illustrate processes where the gene signature comprises both an up list and a down list representative of the most significantly up- and down-regulated genes of a skin condition, it is contemplated that the gene signature may comprise only an up list or a down list when the dominant biology associated with a condition of interest shows gene regulation in predominantly one direction.

In some embodiments, the connectivity score can be a combination of an up score and a down score, wherein the up score represents the correlation between the up-regulated genes of a gene signature and an instance and the down score represents the correlation between the down-regulated genes of a gene signature and an instance. The up score and down score may have values between +1 and −1. For an up score (and down score) a high positive value indicates that the corresponding perturbagen of an instance induced the expression of the microarray probes of the up-regulated (or down-regulated) genes of the gene signature, and a high negative value indicates that the corresponding perturbagen associated with the instance repressed the expression of the microarray probes of the up-regulated (or down-regulated) genes of the gene signature. The up score can be calculated by comparing each identifier of an up list of a gene signature comprising the up-regulated genes (e.g., Tables A, C, I and lists 93, 97, and 107) to an ordered instance list (e.g., Tables E, F, G, H) while the down score can be calculated by comparing each identifier of a down list of a gene signature comprising the down-regulated genes (see, e.g., Tables B, D, J and down lists 95, 99, and 109) to an ordered instance list (e.g., Tables E, F, G, H). In these embodiments, the gene signature comprises the combination of the up list and the down list.

In some embodiments, the connectivity score value may range from +2 (greatest positive connectivity) to −2 (greatest negative connectivity), wherein the connectivity score (e.g., 101, 103, and 105) is the combination of the up score (e.g., 111, 113, 115) and the down score (e.g., 117, 119, 121) derived by comparing each identifier of a gene signature to the identifiers of an ordered instance list. In other embodiments the connectivity range may be between +1 and −1. Examples of the scores are illustrated in FIGS. 6, 7 and 8 as reference numerals 101, 103, 105, 111, 113, 115, 117, 119, and 121. The strength of matching between a signature and an instance represented by the up scores and down scores and/or the connectivity score may be derived by one or more approaches known in the art and include, but are not limited to, parametric and non-parametric approaches. Examples of parametric approaches include Pearson correlation (or Pearson r) and cosine correlation. Examples of non-parametric approaches include Spearman's Rank (or rank-order) correlation, Kendall's Tau correlation, and the Gamma statistic. Generally, in order to eliminate a requirement that all profiles be generated on the same microarray platform, a non-parametric, rank-based pattern matching strategy based on the Kolmogorov-Smirnov statistic (see M. Hollander et al. “Nonparametric Statistical Methods”; Wiley, New York, ed. 2, 1999) (see, e.g., pp. 178-185) can be used. It is noted, however, that where all expression profiles are derived from a single technology platform, similar results may be obtained using conventional measures of correlation, for example, the Pearson correlation coefficient.

In specific embodiments, the methods and systems of the present invention employ the nonparametric, rank-based pattern-matching strategy based on the Kolmogorov-Smirnov statistic, which has been refined for gene profiling data by Lamb's group, commonly known in the art as Gene Set Enrichment Analysis (GSEA) (see, e.g., Lamb et al. 2006 and Subramanian, A. et al. (2005) Proc. Natl. Acad. Sci. U.S.A. 102, 15545-15550). For each instance, a down score is calculated to reflect the match between the down-regulated genes of the query and the instance, and an up score is calculated to reflect the correlation between the up-regulated genes of the query and the instance. In certain embodiments the down score and up score each may range between −1 and +1. The combination represents the strength of the overall match between the query signature and the instance.
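
As a minimal sketch of a Kolmogorov-Smirnov-type score of the kind referred to above, the following function follows one commonly cited formulation from the connectivity map literature: for a list of t signature probes whose sorted positions in an instance of n probes are V(1)...V(t), compute a = max[j/t − V(j)/n] and b = max[V(j)/n − (j−1)/t], and return a if a > b, otherwise −b. The function name and the handling of probes absent from the instance are assumptions for this sketch, and the same function would be applied to the up list and to the down list.

```python
from typing import List


def ks_score(signature_probes: List[str], ranked_instance: List[str]) -> float:
    """Signed KS-type enrichment of a probe list within a ranked instance.

    `ranked_instance` is ordered from most up-regulated to most down-regulated.
    A positive score means the probes cluster near the top of the list and a
    negative score means they cluster near the bottom.
    """
    n = len(ranked_instance)
    position = {probe: i for i, probe in enumerate(ranked_instance, start=1)}
    # Probes missing from the instance are skipped (assumption).
    v = sorted(position[p] for p in signature_probes if p in position)
    t = len(v)
    if t == 0:
        return 0.0
    a = max(j / t - v[j - 1] / n for j in range(1, t + 1))
    b = max(v[j - 1] / n - (j - 1) / t for j in range(1, t + 1))
    return a if a > b else -b


# Hypothetical instance of ten probes ordered from top to bottom.
instance = [f"p{i}" for i in range(1, 11)]
top_score = ks_score(["p1", "p2"], instance)      # probes at the top
bottom_score = ks_score(["p9", "p10"], instance)  # probes at the bottom
```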

The combination of the up score and down score is used to calculate an overall connectivity score for each instance, and in embodiments where up and down score ranges are set between −1 and +1, the connectivity score ranges from −2 to +2, and represents the strength of match between a query signature and the instance. The sign of the overall score is determined by whether the instance links positively or negatively to the signature. Positive connectivity occurs when the perturbagen associated with an instance tends to up-regulate the genes in the up list of the signature and down-regulate the genes in the down list. Conversely, negative connectivity occurs when the perturbagen tends to reverse the up and down signature gene expression changes. The magnitude of the connectivity score is the sum of the absolute values of the up and down scores when the up and down scores have different signs. A high positive connectivity score predicts that the perturbagen will tend to induce the condition that was used to generate the query signature, and a high negative connectivity score predicts that the perturbagen will tend to reverse the condition associated with the query signature. A zero score is assigned where the up and down scores have the same sign, indicating that a perturbagen did not have a consistent impact on the condition signature (e.g., up-regulating both the up and down lists).
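
The combination rule described above may be sketched as follows, assuming up and down scores that each lie between −1 and +1. The function name is hypothetical, and the treatment of a score of exactly zero (regarded here as not sharing a sign with the other score) is an assumption.

```python
def connectivity_score(up_score: float, down_score: float) -> float:
    """Combine an up score and a down score into one connectivity score.

    Returns 0 when both scores have the same sign; otherwise returns
    up_score - down_score, whose magnitude equals the sum of the absolute
    values of the two scores and whose sign follows the up score.
    """
    same_sign = (up_score > 0 and down_score > 0) or \
                (up_score < 0 and down_score < 0)
    if same_sign:
        return 0.0
    return up_score - down_score


# Hypothetical score pairs.
positive_link = connectivity_score(0.8, -0.9)   # perturbagen mimics signature
negative_link = connectivity_score(-0.6, 0.7)   # perturbagen reverses signature
no_link = connectivity_score(0.5, 0.4)          # inconsistent effect
```

In this sketch the result lies between −2 and +2, consistent with the range stated above for up and down scores between −1 and +1.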

According to Lamb et al. (2006), there is no standard for estimating statistical significance of connections observed. Lamb teaches that the power to detect connections may be greater for compounds with many replicates. Replicating in this context means that the same perturbagen is profiled multiple times. Where batch to batch variation must be avoided, a perturbagen should be profiled multiple times in each batch. However, since microarray experiments tend to have strong batch effects, it is desirable to replicate instances in different batches (i.e., experiments) to have the highest confidence that connectivity scores are meaningful and reproducible.

Each instance may be rank ordered according to its connectivity score to the query signature and the resulting rank ordered list displayed to a user using any suitable software and computer hardware allowing for visualization of data.

In some embodiments, the methods may comprise identifying from the displayed rank-ordered list of instances (i) the one or more perturbagens associated with the instances of interest (thereby correlating activation or inhibition of a plurality of genes listed in the query signature to the one or more perturbagens); (ii) the differentially expressed genes associated with any instances of interest (thereby correlating such genes with the one or more perturbagens, the skin tissue condition of interest, or both); (iii) the cells associated with any instance of interest (thereby correlating such cells with one or more of the differentially expressed genes, the one or more perturbagens, and the skin tissue condition of interest); or (iv) combinations thereof. The one or more perturbagens associated with an instance may be identified from the metadata stored in the database for that instance. However, one of skill in the art will appreciate that perturbagen data for an instance may be retrievably stored in and by other means. Because the identified perturbagens statistically correlate to activation or inhibition of genes listed in the query signature, and because the query signature is a proxy for a skin tissue condition of interest, the identified perturbagens may be candidates for new cosmetic agents, new uses of known cosmetic agents, or to validate known agents for known uses.
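
For illustration, rank ordering instances by connectivity score and reading the associated perturbagen from stored metadata may be sketched as follows. The data layout, the function name, the "perturbagen" metadata key and the example values are hypothetical; whether the most positive or the most negative scores are of interest depends on how the query signature was derived.

```python
from typing import Dict, List, Tuple


def rank_instances(scores: Dict[str, float],
                   metadata: Dict[str, Dict[str, str]],
                   descending: bool = True) -> List[Tuple[str, float, str]]:
    """Return (instance ID, score, perturbagen) tuples ordered by score.

    `scores` maps an instance ID to its connectivity score and `metadata`
    maps an instance ID to its stored metadata.
    """
    ordered = sorted(scores.items(), key=lambda item: item[1],
                     reverse=descending)
    return [(iid, score, metadata[iid]["perturbagen"])
            for iid, score in ordered]


# Hypothetical scores and metadata for three instances.
scores = {"i1": 0.4, "i2": -1.6, "i3": 1.2}
metadata = {"i1": {"perturbagen": "compound A"},
            "i2": {"perturbagen": "compound B"},
            "i3": {"perturbagen": "compound C"}}
most_positive_first = rank_instances(scores, metadata)
most_negative_first = rank_instances(scores, metadata, descending=False)
```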

In some embodiments, the methods of the present invention may further comprise testing the selected candidate cosmetic agent, using in vitro assays and/or in vivo testing, to validate the activity of the agent and usefulness as a cosmetic agent. Any suitable in vitro test method can be used, including those known in the art, and most preferably in vitro models developed in accordance with the present invention. For example, MatTek human skin equivalent cultures or other skin equivalent cultures may be treated with one or a combination of perturbagens selected for mimicry of the skin condition of interest with respect to regulation of the genes constituting a physiological theme pattern for the skin condition of interest. For example, the treated skin culture replicates the dandruff condition when it is treated with IL-17 and IL-22 in accordance with the instant invention, and perturbagens may be screened for their ability to shift the homeostatic equilibrium of the treated skin culture toward healthy skin, as determined by transcriptional analysis. Skin biopsy assays may also be used to evaluate candidate skin-active agents as anti-dandruff agents. In some embodiments, evaluation of selected agents using in vitro assays may reveal, confirm, or both, that one or more new candidate cosmetic agents may be used in conjunction with a known cosmetic agent (or a combination of known cosmetic agents) to regulate a skin condition of interest.

V. Methods for Developing In Vitro Models of Skin Disease Conditions

The present investigators discovered a novel application of C-map to derive in vitro models of skin disease conditions and to evaluate the sufficiency of in vitro or in vivo simulations of disease states.

A great challenge in the identification of new therapeutics is the development of in vitro models that are predictive of clinical efficacy. Because no animal models of the dandruff condition are available, there is a need for a model with high fidelity to the internal disease state so that it recapitulates the key features of dandruff lesional skin in vivo. The challenge of developing a good in vitro model for skin conditions such as dandruff is complicated by the fact that the events that trigger the development of dandruff are poorly understood. Transcriptomic profiling work in dandruff lesional skin has provided many new clues, chief among them evidence for a Th-17 driven inflammatory process. Without fully understanding how such a process is initiated in vivo, the present inventors surprisingly discovered that it is possible to simulate such a cascade in vitro by administering to skin cultures the key proinflammatory cytokines produced by Th-17 cells, IL-17A and IL-22.

Hence, it is possible to create an inflammatory milieu that resembles dandruff lesional skin. Indeed, investigation revealed that within four days of administration of human recombinant IL-22 and IL-17A into the culture medium of human 3-dimensional organotypic cultures, hyperplasia was produced, differentiation marker expression (e.g., K1/K10, S100A7) was perturbed, and secretion of IL-8 increased. All of these endpoints are features of dandruff lesional skin, and all substances that possess anti-dandruff activity in vivo are capable of blocking these responses in the novel in vitro model. These substances include selenium sulfide, ZPT, ketoconazole, clobetasol propionate and the iron chelator 1,10-phenanthroline.

The in vitro disease simulation according to the invention produces a pattern of gene expression that strongly resembles dandruff lesional skin. Affymetrix U133A Plus 2 microarrays were used to evaluate the gene expression profile elicited by exposure of organotypic human skin cultures to a variety of proinflammatory cytokines individually and in combination, as set forth in FIG. 21A and FIG. 21B. Analysis of Gene Ontology themes showed that the combination of IL-17A and IL-22 produces a thematic profile that closely resembles that of dandruff lesional skin, while other cytokines produced thematic profiles that were significantly different. Crucially, connectivity mapping exercises using gene expression signatures derived from dandruff lesional skin (e.g., “lipid-immune” and “dandruff” as set forth in FIG. 1A and FIG. 1B) show that this in vitro cytokine simulation produces linkages that are among the most strongly positively linked of all in the internal database, which consists of approximately 5,000 instances. This strongly supports that the simulation is producing a gene expression pattern that resembles the disease condition.

Connectivity mapping could be used to evaluate the sufficiency of other in vitro or even in vivo disease models in animals, including transgenic animals (knockout, knock-in, etc.). The present investigators have demonstrated that by developing gene expression signatures from a disease state, it is possible using connectivity mapping to interrogate how closely a given disease model mimics the disease state. By manipulating model conditions to most closely approximate a disease state, predictivity of therapeutic efficacy is expected to be dramatically improved.
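
One way to express how closely a disease model mimics a disease state, sketched here under stated assumptions, is to locate the connectivity score of the model instance among the scores of all instances in a database queried with the same disease-derived signature. The function name, the rank definition and the example scores are hypothetical.

```python
from typing import List, Tuple


def model_rank(model_score: float,
               database_scores: List[float]) -> Tuple[int, float]:
    """Rank a disease model's connectivity score against a database.

    Returns (rank, fraction), where rank 1 means no database instance links
    more positively to the disease signature than the model does, and
    fraction is rank divided by the total number of scores compared
    (database instances plus the model itself).
    """
    higher = sum(1 for score in database_scores if score > model_score)
    rank = higher + 1
    total = len(database_scores) + 1
    return rank, rank / total


# Hypothetical connectivity scores for four database instances.
rank, fraction = model_rank(1.6, [1.8, 0.2, -0.5, 0.0])
```

A model whose rank is near 1 would, in this sketch, be among the instances most strongly positively linked to the disease signature.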

VI. Compositions and Personal Care Products

Generally, skin-active agents identified for the treatment of dandruff or dandruff-related skin conditions may be applied in accordance with cosmetic compositions and formulation parameters well-known in the art. Various methods of treatment, application, regulation, or improvement may utilize the skin care compositions comprising skin-active agents identified according to the inventive methods. The composition may be applied as part of routine hygiene relating to the hair and scalp and may be formulated as shampoos, conditioners, hair sprays, creams, ointments and the like. The composition may be applied to the scalp to treat dandruff or symptoms of dandruff present in other skin disorders.

U.S. Pat. Nos. 7,101,889; 5,624,666; 6,451,300; 6,974,569; and 7,001,594 are non-limiting examples of US patents comprising guidance on compositions, formulations, vehicles, administration, and other aspects relating to personal care products comprising anti-dandruff agents formulated for the treatment of dandruff. The entire disclosures of these patents are incorporated herein by this reference.

EXAMPLES

The present invention will be better understood by reference to the following examples, which are offered by way of illustration, not limitation.

Generally Applicable C-Map Methodology

Generating Instances

Individual experiments (referred to as batches) generally comprise 30 to 96 samples analyzed using Affymetrix GeneChip® technology platforms, containing 6 replicates of the vehicle control (e.g., DMSO), 2 replicate samples of a positive control that gives a strong reproducible effect in the cell type used, and samples of the test material/perturbagen. Replication of the test material is done in separate batches due to batch effects. In vitro testing was performed in 6-well plates to provide sufficient RNA for GeneChip® analysis (2-4 μg total RNA yield/well).

Human telomerized keratinocytes (tKC) were obtained from the University of Texas, Southwestern Medical Center, Dallas, Tex. tKC cells were grown in EpiLife® media with 1× Human Keratinocyte Growth Supplement (Invitrogen, Carlsbad, Calif.) on collagen I coated cell culture flasks and plates (Becton Dickinson, Franklin Lakes, N.J.). Keratinocytes were seeded into 6-well plates at 20,000 cells/cm² 24 hours before chemical exposure. Human skin fibroblasts (BJ cell line from ATCC, Manassas, Va.) were grown in Eagle's Minimal Essential Medium (ATCC) supplemented with 10% fetal bovine serum (HyClone, Logan, Utah) in normal cell culture flasks and plates (Corning, Lowell, Mass.). BJ fibroblasts were seeded into 6-well plates at 12,000 cells/cm² 24 hours before chemical exposure.

All cells were incubated at 37° C. in a humidified incubator with 5% CO₂. At t=−24 hours cells were trypsinized from T-75 flasks and plated into 6-well plates in basal growth medium. At t=0 media was removed and replaced with the appropriate dosing solution as per the experimental design. Dosing solutions were prepared the previous day in sterile 4 ml Falcon snap cap tubes. Pure test materials may be prepared at a concentration of 1-200 μM, and botanical extracts may be prepared at a concentration of 0.001 to 1% by weight of the dosing solution. After 6 to 24 hours of chemical exposure, cells were viewed and imaged. The wells were examined with a microscope before cell lysis and RNA isolation to evaluate for morphologic evidence of toxicity. If morphological changes were sufficient to suggest cytotoxicity, a lower concentration of the perturbagen was tested. Cells were then lysed with 350 μl/well of RLT buffer containing β-mercaptoethanol (Qiagen, Valencia, Calif.), transferred to a 96-well plate, and stored at −20° C.

RNA from cell culture batches was isolated from the RLT buffer using Agencourt® RNAdvance Tissue-Bind magnetic beads (Beckman Coulter) according to manufacturer's instructions. 1 μg of total RNA per sample was labeled using Ambion Message Amp™ II Biotin Enhanced kit (Applied Biosystems Incorporated) according to manufacturer's instructions. The resultant biotin labeled and fragmented cRNA was hybridized to an Affymetrix HG-U133A 2.0 GeneChip®, which was then washed, stained and scanned using the protocol provided by Affymetrix.

Example 1

Deriving a Dandruff Expression Signature

The samples were analyzed on the Affymetrix HG-U133 Plus 2.0 GeneChips, which contain 54,613 probe sets complementary to the transcripts of more than 20,000 genes. However, instances in the provided database used were derived from gene expression profiling experiments using Affymetrix HG-U133A 2.0 GeneChips, containing 22,214 probe sets, which are a subset of those present on the Plus 2.0 GeneChip. Therefore, in developing gene expression signatures from the clinical data, the probe sets were filtered for those included in the HG-U133A 2.0 gene chips.

A statistical analysis of the microarray data was performed to derive a plurality of dandruff gene expression signatures which may comprise a statistically relevant number of the up-regulated and down-regulated genes. In certain embodiments a dandruff gene expression signature includes between 10 and 400 up-regulated and/or between 10 and 400 down-regulated genes. In more specific embodiments a dandruff gene expression signature includes the 70 most statistically relevant up-regulated genes alone or in combination with the 70 most statistically relevant down-regulated genes. Regulation is determined in comparison to gene expression in normal dandruff-unaffected skin on non-dandruff subjects.

a. Filtering According to a Statistical Measure. For example, a suitable statistical measure may be p-values from a t-test, ANOVA, correlation coefficient, or other model-based analysis. As one example, p-values may be chosen as the statistical measure and a cutoff value of p=0.05 may be chosen. Limiting the signature list to genes that meet some reasonable cutoff for statistical significance compared to an appropriate control is important to allow selection of genes that are characteristic of the biological state of interest. This is preferable to using a fold change value, which does not take into account the noise around the measurements. The t-statistic was used to select the probe sets in the signatures because it is signed and provides an indication of the directionality of the gene expression changes (i.e., up- or down-regulated) as well as statistical significance.

b. Sorting the Probe Sets. All the probe sets are sorted into up-regulated and down-regulated sets using the statistical measure. For example, if a t-test was used to compute p-values, the values (positive and negative) of the t-statistic are used to sort the list since p-values are always positive. The sorted t-statistics will place the sets with the most significant p-values at the top and bottom of the list with the non-significant ones near the middle.

c. Creation of the Gene Expression Signature. Using the filtered and sorted list created, a suitable number of probe sets from the top and bottom are selected to create a gene expression signature that preferably has approximately the same number of sets chosen from the top as chosen from the bottom. For example, the gene expression signature created may have at least about 10, 50, 70, 100, 200, or 300 and/or less than about 800, 600, 400 or about 100 genes corresponding to a probe set on the chip. The number of probe sets approximately corresponds to the number of genes, but most genes are represented by more than one probe set. It is understood that the phrase "number of genes" as used herein, corresponds generally with the phrase "number of probe sets."
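Steps (a) through (c), together with the restriction to probe sets present on the HG-U133A 2.0 chip, can be sketched in a few lines of Python. This is a minimal illustration only: the function name `build_signature`, its parameters, and the probe set IDs in the usage example are hypothetical, and the t-statistics and p-values are assumed to have been computed beforehand by whatever statistical package was used.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def build_signature(
    stats: Dict[str, Tuple[float, float]],    # probe set ID -> (t-statistic, p-value)
    allowed: Optional[Iterable[str]] = None,  # e.g. probe sets present on the target chip
    p_cutoff: float = 0.05,
    n_each: int = 70,
) -> Tuple[List[str], List[str]]:
    """Return (up, down) probe set lists, each ordered from most to least extreme."""
    allowed_set = None if allowed is None else set(allowed)

    # (a) Filter: keep probe sets on the target chip that pass the p-value cutoff.
    kept = [
        (t, probe)
        for probe, (t, p) in stats.items()
        if p < p_cutoff and (allowed_set is None or probe in allowed_set)
    ]

    # (b) Sort by the signed t-statistic, most up-regulated first.
    kept.sort(key=lambda item: item[0], reverse=True)

    # (c) Take the same number of probe sets from the top and from the bottom.
    up = [probe for t, probe in kept[:n_each] if t > 0]
    down = [probe for t, probe in reversed(kept[-n_each:]) if t < 0]
    return up, down


# Hypothetical example data (not from the study).
example_stats = {
    "p1": (5.2, 0.001),
    "p2": (3.1, 0.02),
    "p3": (0.4, 0.70),    # not significant, dropped in step (a)
    "p4": (-2.9, 0.03),
    "p5": (-6.0, 0.0005),
    "p6": (4.0, 0.01),    # significant, but absent from the target chip
}
example_up, example_down = build_signature(
    example_stats, allowed=["p1", "p2", "p3", "p4", "p5"], n_each=2
)
```

Sorting on the signed t-statistic rather than the p-value is what keeps up-regulated and down-regulated probe sets at opposite ends of the list, as step (b) describes.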

For dandruff, one exemplary gene expression signature includes the 70 most significant up-regulated and 70 most significant down-regulated probe sets determined from comparing a dandruff-affected skin sample to a dandruff-unaffected skin sample, as set forth in Table B, FIG. 1A and FIG. 1B. Another exemplary gene expression signature is derived from the physiological thematic signature where genes derived from the gene cluster associated with one or more significant themes constitute a gene expression signature. A dandruff gene expression signature reflecting the lipid-immune/inflammation theme signature is set forth in Table A, FIG. 1A and FIG. 1B.

Example 2

This example illustrates that the complex dandruff condition may be represented by keratinocyte-based models and screening methods, and that gene expression profiles from keratinocytes and dandruff gene expression signatures can be used to reliably screen for candidate cosmetic agents for dandruff. The Example further illustrates the use of the gene expression profile to determine physiological thematic signatures useful for querying C-map to generate potential new skin-active agents and useful for screening skin-active agents for anti-dandruff efficacy.

In accordance with methods of the invention, a broad gene expression profile for dandruff constituting the approximately 3,700 most-regulated genes was determined from comparing transcription data of dandruff-affected scalp skin to non-dandruff scalp skin. By analyzing the gene expression data in terms of Gene Ontology, a physiological theme profile is determined. This Example further illustrates that analysis of the Gene Ontology for dandruff when compared to other dandruff-related conditions yields a highly specific theme pattern. According to the inventive methods, skin-active agents may be screened for potential efficacy in the treatment of dandruff by selecting agents which act to shift the physiological theme signature toward that of healthy skin, which signifies restoration of a desired state of homeostatic equilibrium characteristic of non-affected skin. The present investigators hypothesize that such an approach to new active discovery will yield treatments both effective and long-lasting.

To screen for anti-dandruff agents having strong skin activity, a gene expression signature was selected to comprise a subset of up-regulated and down-regulated genes representative of lipid metabolism and those representative of immune/inflammatory response, the two physiological themes constituting the most statistically salient thematic profile for dandruff. It is noted, however, that a subset of up-regulated and down-regulated genes representative of hyperproliferation could also have been used for the gene signature.

This signature was used to query a C-map database comprising gene expression profiles from fibroblast and keratinocyte cell lines exposed to a large number of different chemicals including the anti-dandruff agents ketoconazole, climbazole, clobetasol propionate, ZPT, and selenium sulfide. Each agent was tested at several concentrations. As shown in Table E, the highest-ranked results include clobetasol propionate, which is known to be the most effective anti-dandruff agent which acts by triggering strong skin activity. This result validates the effectiveness of the process. In addition, the highest-ranked results also include the anti-fungal agents ketoconazole and climbazole, suggesting that they may be effective in treating dandruff by inducing skin effects, as well as anti-fungal effects. Moreover, that ZPT and selenium sulfide are not in the list of instances strongly linked to the gene signature suggests that their anti-dandruff properties may be related to other activities not addressed by this thematic signature.

The results shown in Table E also confirm the conclusion that gene expression profiles from keratinocyte cell lines (a proxy for the epidermis) are useful for screening of candidate cosmetic agents for dandruff. As can be seen, the highest-ranked results are in keratinocyte cell lines.

TABLE E

Rank  Chip ID                                    Chemical               Cell Line  Concentration  Score
1     GSS128_Keto_10_24hr-80                     Ketoconazole           tKC        10 μM          −0.72
2     GSS128_CB_10_24hr-58                       Climbazole             tKC        10 μM          −0.67
3     GSS128_CP_20_24hr-67                       Clobetasol Propionate  tKC        20 μM          −0.65
4     GSS128_Keto_10_24hr-79                     Ketoconazole           tKC        10 μM          −0.63
5     GSS128_CP_10_24hr-66                       Clobetasol Propionate  tKC        10 μM          −0.62
6     GSS128_CB_20_24hr-60                       Climbazole             tKC        20 μM          −0.61
7     GSS128_Keto_1_24hr-78                      Ketoconazole           tKC         1 μM          −0.58
8     GSS128_CP_20_24hr-68                       Clobetasol Propionate  tKC        20 μM          −0.57
9     GSS128_CB_1_6hr-16                         Climbazole             tKC         1 μM          −0.55
10    GSS106A_cyclosporin_01_tert_keratinocytes  Cyclosporin            tKC        10 μM          −0.54
11    GSS128_CP_10_24hr-65                       Clobetasol Propionate  tKC        10 μM          −0.54
12    GSS122_MCF_Cyclosporin_B                   Cyclosporin            MCF7       10 μM          −0.53
13    GSS106A_triac_01_tert_keratinocytes        Triac                  tKC        10 μM          −0.53
14    GSS128_CB_20_24hr-59                       Climbazole             tKC        20 μM          −0.53
15    5202764005789148112904.C05                 Rosiglitazone          MCF7       10 μM          −0.51

In light of the above, it was concluded that the complex dandruff condition may be represented by keratinocyte-based models and screening methods. Moreover, it was determined that C-map, gene expression profiles from keratinocytes, and dandruff gene expression signatures can be used to reliably screen for candidate cosmetic agents for dandruff. Furthermore, it was determined that such screening can be done without knowing the mechanisms of action involved in dandruff.

Example 3

This Example illustrates validation of an in vitro model of the dandruff condition according to one embodiment of the present invention and the use of thematic signatures to guide the C-map query for skin-active agent candidate output.

Gene expression data from five inflammatory skin disorders (acne, atopic dermatitis, dandruff, eczema and psoriasis) were collected from a clinical genomics study and published studies. The raw expression data were used to produce a rank-ordered list of most differentially regulated genes associated with inflammatory skin disorders. This list was used to construct a gene signature for querying the provided database, the signature comprising the top 70 up-regulated and 70 down-regulated genes from the rank-ordered list.

The derived gene signature was used to query a provided database comprising gene expression data from clinical genomics studies of widely different inflammatory skin disorders, published in vitro genomics studies of disparate inflammatory skin disorders, and genomics data from an internal in vitro model of dandruff inflammatory pathology (human organotypic MatTek cultures). As shown in Table F, the signature mapped strongly to the internal model, as well as to clinical genomics studies, thereby suggesting that the internal model elicits gene expression changes that are comparable to what is seen in vivo in inflammatory skin conditions. Thus, the internal model was validated as being useful for study of inflammatory cascades and other gene expression alterations associated with inflammatory skin disorders.

TABLE F
Connectivity Map Linkage Scores Using Derived Signature to Query the Database

Rank  Chip ID            Treatment  Cell Line  Conc.      Score  Up Score  Down Score
9428  GSM173545-IL24     IL24       RHE         20 ng/ml  0.833  0.485     −0.348
9427  GSM173544-IL24     IL24       RHE         20 ng/ml  0.829  0.517     −0.313
9426  GSM173537-IL19     IL19       RHE         20 ng/ml  0.807  0.505     −0.302
9425  GSM173546-IL24     IL24       RHE         20 ng/ml  0.801  0.461     −0.340
9424  GSM173542-IL22     IL22       RHE         20 ng/ml  0.789  0.444     −0.345
9423  GSM173535-IL19     IL19       RHE         20 ng/ml  0.749  0.410     −0.339
9422  GSM173541-IL22     IL22       RHE         20 ng/ml  0.740  0.410     −0.330
9421  GSM173539-IL20     IL20       RHE         20 ng/ml  0.731  0.424     −0.307
9420  GSM173536-IL19     IL19       RHE         20 ng/ml  0.729  0.460     −0.269
9419  GSM173556-IL1b     IL1b       RHE         10 ng/ml  0.710  0.420     −0.291
9418  GSM173543-IL22     IL22       RHE         20 ng/ml  0.705  0.388     −0.317
9417  GSM173540-IL20     IL20       RHE         20 ng/ml  0.702  0.333     −0.369
9416  GSM173538-IL20     IL20       RHE         20 ng/ml  0.668  0.436     −0.232
9415  GSM173554-IFNg     IFNg       RHE         10 ng/ml  0.665  0.420     −0.245
9414  GSM173553-IFNg     IFNg       RHE         10 ng/ml  0.657  0.431     −0.226
9413  GSM173555-IL1b     IL1b       RHE         10 ng/ml  0.649  0.380     −0.269
9412  GSS157_13 BEAS-2B  RV-13      BEAS-2B               0.602  0.378     −0.224
9411  GSM305449 HK23/2   IL17       hKC        200 ng/ml  0.579  0.428     −0.151
9410  GSM305450 HK23/2   IL22       hKC        200 ng/ml  0.573  0.358     −0.214
9409  GSM305448 HK23/2   IFNg       hKC         20 ng/ml  0.552  0.450     −0.102

Legend: IL = interleukin; IFN = interferon; RHE = reconstituted human epidermis; BEAS-2B RV-13 = human bronchial epithelial cells treated with rhinovirus-13; hKC = human keratinocytes
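In Table F the overall score equals the up score minus the down score (for example, 0.485 − (−0.348) = 0.833). The sketch below shows one way such a score could be computed for a single instance. It assumes the Kolmogorov-Smirnov-style enrichment statistic used in the published Connectivity Map literature; this section does not restate the scoring formula, so the functions `ks_score` and `connectivity`, the rank convention (position 1 = most up-regulated), and the gene identifiers in the usage example are all illustrative assumptions. Any normalization across instances is omitted.

```python
from typing import List, Sequence, Tuple


def ks_score(ranks: List[int], n: int) -> float:
    """Signed KS-style enrichment of signature ranks within an ordered list of length n.

    Positive values mean the tags cluster near the top of the list,
    negative values mean they cluster near the bottom.
    """
    t = len(ranks)
    if t == 0:
        return 0.0
    v = sorted(ranks)
    a = max((j + 1) / t - v[j] / n for j in range(t))
    b = max(v[j] / n - j / t for j in range(t))
    return a if a > b else -b


def connectivity(
    instance: Sequence[str], up: Sequence[str], down: Sequence[str]
) -> Tuple[float, float, float]:
    """Return (score, up_score, down_score) for one ordered instance."""
    position = {probe: i + 1 for i, probe in enumerate(instance)}
    n = len(instance)
    up_score = ks_score([position[p] for p in up if p in position], n)
    down_score = ks_score([position[p] for p in down if p in position], n)
    # If both tag lists are enriched at the same end, the instance is not linked.
    if up_score * down_score > 0:
        return 0.0, up_score, down_score
    return up_score - down_score, up_score, down_score


# Hypothetical ten-gene instance, ordered from most up- to most down-regulated.
example_instance = [f"g{i}" for i in range(1, 11)]
mimic = connectivity(example_instance, up=["g1", "g2"], down=["g9", "g10"])
reverse = connectivity(example_instance, up=["g9", "g10"], down=["g1", "g2"])
```

Under this convention a positive score indicates an instance that mimics the query signature, as for the cytokine-treated cultures in Table F, while a negative score indicates an instance that opposes it, as for the anti-dandruff agents in Table E.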

Example 4

This example illustrates application of transcriptional profiling to investigate the pathogenesis of dandruff and to determine the mechanism of action of a benchmark anti-dandruff active.

Dandruff (seborrheic dermatitis) is a chronic keratinous condition and involves numerous variables and mechanisms, many of which are unknown. It is believed that dandruff has hereditary components and environmental components (e.g., yeast irritation). Most anti-dandruff research is directed to anti-fungal properties of agents rather than host-centric properties (i.e., inducement or reduction in a response in the human).

Dandruff and seborrheic dermatitis are common chronic relapsing scalp skin disorders that share some clinical features with psoriasis and atopic dermatitis. While seborrheic dermatitis can affect sebum-rich areas other than the scalp, we routinely refer to these conditions on the scalp collectively as "dandruff." Like psoriasis and atopic dermatitis, the pathogenesis of dandruff is complex, and appears to be the result of interactions among scalp skin, cutaneous microflora and the cutaneous immune system. The key clinical features of dandruff include flaking and itch, but the understanding of the precise underlying events that provoke these symptoms is limited.

Clues, however, have been derived from studies concerning the removal of Malassezia yeasts by treatment with antifungal drugs; studies involving treatment with corticosteroids or coal tar; as well as from investigations involving stratum corneum (SC) ultrastructure and SC lipid composition. All of this evidence supports that there is a pronounced disruption of epidermal homeostasis that leads to the excessive scaling prominent in the dandruff condition. For example, the presence of parakeratosis in SC samples from the dandruff condition suggests that hyperproliferation is a feature of the dandruff lesion, and the associated pruritus (itch) is possibly the result of inflammation and mast cell degranulation.

Generally, gene expression profiles for the disease condition are compared to the gene expression profiles in the non-disease condition to determine genes differentially regulated in the condition, referred to as the gene expression profile. The profile is analyzed to determine the key physiological disruptions manifest in the condition. Once a physiological theme profile is derived for the condition, a C-map may be queried for perturbagens with strong connectivity to the relevant physiological themes. The goal is to identify a set of one or more perturbagens which, when applied either alone or in combination to a skin culture, engender a response in the skin culture having a thematic signature which substantially mimics the thematic signature of the disease condition. The skin culture may then be used to screen for agents having strong negative connectivity to the thematic signature. The present inventors determined a physiological thematic signature for a dandruff condition, with the broad pattern set forth in FIG. 9. By Gene Ontology analysis and in accordance with Example 3, above, a highly relevant thematic signature for dandruff was derived which includes the themes of lipid metabolism and immune/inflammation in an inverse relationship. That is, in the dandruff condition the thematic signature includes a decrease in lipid metabolism with an increase in inflammation. In the following examples, gene expression signatures according to an aspect of the invention are used to investigate the skin response for a benchmark anti-dandruff agent, ZPT.

Methods

Two separate studies were performed:

1) 31 healthy male subjects aged 18-75 were divided into two groups of 16 "non-dandruff" and 15 "dandruff" subjects, as defined by a published flake scoring procedure, adherent scalp flake score (ASFS). Two full-thickness four-millimeter punch biopsies were obtained from the dandruff subjects, one at an actively flaking site ("involved") and one from a non-flaking site ("uninvolved"). A single biopsy was collected from the non-sufferers at an anatomically comparable site.

2) In a double-blinded treatment study, 45 healthy male subjects (30 dandruff and 15 non-dandruff as defined by ASFS criteria, aged 18-50 years) were enrolled and were shampooed at the clinical site three (3) times a week for three (3) weeks with either a commercially available anti-dandruff shampoo with 1% ZPT (15 dandruff subjects) or the same formula without ZPT (15 dandruff and 15 non-dandruff subjects). Full thickness 2 mm biopsies were collected from all three groups at baseline and end of study. Total RNA was extracted from the biopsies and labeled for Affymetrix GeneChip® analysis. The synthesized target cRNA was hybridized to Affymetrix HG U133A microarrays. Statistically analyzed data were filtered by significance (p<0.05, Dandruff vs. Non-Dandruff; ZPT vs. vehicle treatments) to identify genes showing an increase or decrease in expression level, a standard bioinformatics approach.

Methodology and Results of Study 1:

FIG. 10 illustrates the differential gene expression observed in Dandruff vs. Non-Dandruff for all individuals.

Genome-wide transcriptional profiles were assessed using RNA extracted from full thickness scalp biopsies. Target cDNA (from extracted mRNA) was hybridized to Affymetrix U133 Plus 2 microarrays (54,613 probes). A heat map was generated of the normalized expression values (z-scores) of the 3,757 genes significantly differentially expressed between healthy (green) and dandruff (red) samples.

A gene was included if at least one of its Affymetrix probe sets had a t-test p-value less than 0.05. The signal value of the probe set with the minimum p-value was used in the heat map. Looking at the heat map of FIG. 10, the column side bar indicates sample groups and the row side bar indicates biological processes in Gene Ontology. The color scheme for the biological process themes is: Green: Lipid Metabolism; Red: Immune Response; Orange: Response to Stimulus; Blue: Epidermis Development; Cyan: Cell Proliferation; Magenta: Apoptosis; Yellow: Others. This color scheme applies to all heat maps set forth in the Figures herein.
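The probe set selection and z-score normalization behind the heat map can be sketched as follows. The text states only that the probe set with the minimum p-value represents each gene and that values are z-scores; computing the z-score per gene across samples with the sample standard deviation is an assumption, and the function name `heatmap_rows`, the record layout, and the example values are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List


def heatmap_rows(probes: List[dict], p_cutoff: float = 0.05) -> Dict[str, List[float]]:
    """Pick one probe set per gene (minimum p-value) and z-score its signal.

    Each record is assumed to look like:
    {"gene": str, "p_value": float, "signal": [float, ...]}  (one value per sample)
    """
    # Keep, for each gene, the probe set with the smallest p-value.
    best: Dict[str, dict] = {}
    for probe in probes:
        current = best.get(probe["gene"])
        if current is None or probe["p_value"] < current["p_value"]:
            best[probe["gene"]] = probe

    rows: Dict[str, List[float]] = {}
    for gene, probe in best.items():
        if probe["p_value"] >= p_cutoff:
            continue  # no probe set for this gene reached significance
        mu = mean(probe["signal"])
        sd = stdev(probe["signal"])
        if sd == 0:
            continue  # a flat signal cannot be z-scored
        rows[gene] = [(x - mu) / sd for x in probe["signal"]]
    return rows


# Hypothetical example data (not from the study).
example_probes = [
    {"gene": "A", "p_value": 0.20, "signal": [1.0, 2.0, 3.0]},
    {"gene": "A", "p_value": 0.01, "signal": [2.0, 4.0, 6.0]},
    {"gene": "B", "p_value": 0.30, "signal": [5.0, 5.5, 6.0]},
]
example_rows = heatmap_rows(example_probes)
```

Normalizing each gene separately puts rows with very different absolute signal levels on a common color scale, so the heat map shows relative change between sample groups rather than raw abundance.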

Group averages for the same 3,757 genes as above are reflected in the heat map set forth in FIG. 11. The heat map depicts, for each sample group, the averaged normalized expression values of the genes significantly differentially expressed between dandruff and healthy scalp. The column side bar indicates sample groups and the row side bar indicates biological processes in Gene Ontology. The investigators note that although dandruff uninvolved (DUI) and non-dandruff (ND) cluster together, many genes involved in immune response are elevated in dandruff uninvolved skin (skin sampled from apparently unaffected scalp skin belonging to the same subject from which the dandruff-affected skin is sampled), including those involved in complement, response to stress, pathogens, cell signaling, etc. The broad pattern physiological theme profile is set forth in FIG. 9.

A heat map depicting differential gene expression with respect to the more specific lipid metabolism/immune & inflammation theme signature for all individuals is set forth in FIG. 12. The columns are the subject descriptions, with Non-Dandruff grouped to the left and Dandruff to the right. The rows represent the Immune/Inflammatory cluster and the Lipid Metabolism cluster. Group averages for the data, including the subject conditions of dandruff-affected, dandruff-uninvolved and non-dandruff, are set forth in FIG. 13.

Heat maps depicting the differential expression of genes involved in skin barrier lipid production are set forth as FIG. 14 (fatty acid synthesis pathway), FIG. 15 (cholesterol synthesis pathway) and FIG. 16 (sphingolipid synthesis pathway). This permits comparison with known stratum corneum biomarkers for these pathways. As set forth graphically in FIG. 17A and FIG. 17B, the biomarker data support the transcriptomic findings with regard to the implication of barrier lipids and inflammation in the dandruff condition.

Methodology, Results for Study 2:

The ZPT transcriptomics study design is set forth in FIG. 18. Notably, the study is a double-blind, vehicle-controlled evaluation of the effect of ZPT on scalp gene expression. The subject's hair/scalp was washed by study personnel at the clinical site 3×/week for 3 weeks. Full thickness 2 mm punch biopsies were collected at baseline and end of study. Product exposures were provided under conditions known to reliably reduce flake scores, epidermal thickness, itch and histamine, as well as substantially restore stratum corneum (SC) biomarker profiles.

The effect of ZPT treatment on differential gene expression is set forth for group averages as FIG. 19. As can clearly be seen by mere visual inspection, ZPT treatment at week 3 of the dandruff condition resulted in an expression shift toward the non-dandruff condition. The hierarchical clustering of ~3700 significantly altered genes shows that under conditions known to substantially resolve key symptoms, ZPT caused a dramatic change resulting in a broad profile that resembles healthy scalp.

FIG. 20 illustrates that the same result is seen where the gene expression signature is derived from the lipid/inflammation thematic model of the dandruff condition. Both for the lipid metabolism cluster and the immune/inflammation cluster, a dramatic shift toward the healthy scalp condition is suggested by inspection of the heat map.

The present investigators discovered, through transcriptomic profiling of dandruff, dramatic alterations in a number of physiological processes, most notably an inverse thematic relationship between lipid metabolism and inflammation. Notably, the studies also show that genes involved in immune function/inflammation were statistically over-represented in the up-regulated category in a comparison of dandruff uninvolved skin and normal scalp, suggesting the existence of predisposing factors related to inflammation. The gene expression changes noted in the dandruff profile were substantially consistent at the phenotypic level (proteins and SC lipids). Treatment with a ZPT-containing shampoo, but not the control without the active, was able to restore a transcriptomic profile that resembled that of healthy scalp skin (as shown by hierarchical clustering analysis).
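A statement that a Gene Ontology theme is statistically over-represented among up-regulated genes is commonly checked with a one-sided hypergeometric test. The source does not say which enrichment statistic was used, so the following is only an illustrative sketch under that assumption; the function name `enrichment_p`, its parameters, and the counts in the usage example are hypothetical.

```python
from math import comb


def enrichment_p(n_total: int, n_theme: int, n_selected: int, n_overlap: int) -> float:
    """One-sided hypergeometric p-value for theme over-representation.

    n_total    : genes measured overall
    n_theme    : genes annotated to the theme (e.g. immune response)
    n_selected : genes in the list of interest (e.g. up-regulated genes)
    n_overlap  : genes that are both selected and annotated to the theme
    Returns the probability of seeing n_overlap or more by chance.
    """
    upper = min(n_theme, n_selected)
    tail = sum(
        comb(n_theme, k) * comb(n_total - n_theme, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_total, n_selected)


# Hypothetical counts: 10 genes measured, 4 in the theme, 3 selected, 2 of them in the theme.
example_p = enrichment_p(n_total=10, n_theme=4, n_selected=3, n_overlap=2)
```

A small p-value would indicate that the theme appears among the selected genes more often than random sampling would predict. In practice a correction for testing many themes at once would also be applied.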

The dimensions and values disclosed herein are not to be understood as being strictly limited to the exact numerical values recited. Instead, unless otherwise specified, each such dimension is intended to mean both the recited value and a functionally equivalent range surrounding that value. For example, a dimension disclosed as "40 mm" is intended to mean "about 40 mm."

Every document cited herein, including any cross referenced or related patent or application, is hereby incorporated herein by reference in its entirety unless expressly excluded or otherwise limited. The citation of any document is not an admission that it is prior art with respect to any invention disclosed or claimed herein or that it alone, or in any combination with any other reference or references, teaches, suggests or discloses any such invention. Further, to the extent that any meaning or definition of a term in this document conflicts with any meaning or definition of the same term in a document incorporated by reference, the meaning or definition assigned to that term in this document shall govern.

While particular embodiments of the present invention have been illustrated and described, it would be obvious to those skilled in the art that various other changes and modifications can be made without departing from the spirit and scope of the invention. It is therefore intended to cover in the appended claims all such changes and modifications that are within the scope of this invention.

What is claimed is:
1. A method for constructing a data architecture for use in identifying connections between perturbagens and genes associated with dandruff, and preparing a dandruff care composition comprising: (a) providing a gene expression profile for a control human epidermal keratinocyte cell; (b) generating a gene expression profile for a human epidermal keratinocyte cell exposed to at least one perturbagen, by extracting a biological sample from the treated cell and subjecting the biological sample to microarray analysis via a microarray scanner wherein generating the gene expression profile in at least one of (a) and (b) comprises i. isolating RNA from the human epidermal keratinocyte cell, and ii. creating cDNA from the isolated RNA; (c) identifying genes differentially expressed in response to the at least one perturbagen by comparing the gene expression profiles of (a) and (b); (d) creating an ordered list comprising identifiers representing the differentially expressed genes, wherein the identifiers are ordered according to the differential expression of the genes; (e) storing the ordered list as a keratinocyte instance on at least one computer readable medium; and (f) constructing a data architecture of stored keratinocyte instances by repeating (a) through (e), wherein the at least one perturbagen of step (a) is different qualitatively or quantitatively for each keratinocyte instance; (g) querying the data architectures of stored keratinocyte instances with at least one dandruff gene expression signature, wherein querying comprises comparing the at least one dandruff gene expression signature to each stored keratinocyte instance, wherein the dandruff gene expression represents genes differentially expressed in association with dandruff; (h) assigning a connectivity score to each of the instances; and (i) preparing a dandruff care composition comparing at least one perturbagen, wherein the connectivity score of the instance associated with the at least one perturbagen has a negative correlation.
2. A method according to claim 1, comprising using a programmable computer to perform one or more of steps (c), (d), (e) and (f).
3. A method according to claim 1, wherein the ordered list comprises the ordered list of identifiers in association with a numerical ranking for the identifier corresponding to its rank in the ordered list.
4. A method according to claim 1, wherein the biological sample comprises mRNA.
5. A method according to claim 1, wherein the microarray is a global microarray or a specific microarray, wherein the specific microarray comprises oligonucleotides which hybridize to genes corresponding to a gene expression signature for a cellular phenotype.
6. A method according to claim 1, wherein the step of constructing the data architecture of stored keratinocyte instances by repeating steps (a) through (e) comprises repeating steps (a) through (e) for between about 50 and about 50,000 instances.
7. A method according to claim 6, wherein the step of constructing the data architecture of stored keratinocyte instances comprises repeating steps (a) through (e) for between about 1000 and about 20,000 instances.
8. A method according to claim 1, wherein the at least one perturbagen is an anti-dandruff agent.
9. A method according to claim 8, wherein the anti-dandruff agent induces a host response to produce a host effect, or induces anti-fungal activity to produce an anti-fungal effect, or both.
10. The method according to claim 9, wherein the anti-dandruff agent induces a host response to produce a host effect.
11. The method according to claim 10, wherein the host response is restoration of epidermal homeostasis present in healthy scalp skin.
12. The method according to claim 11, wherein restoration of epidermal homeostasis is assessed by measuring a shift in a transcriptional profile derived from scalp skin of the host toward a transcriptional profile of healthy scalp skin.
13. A method according to claim 1, wherein the identifiers are selected from the group consisting of gene names, gene symbols, microarray probe set ID values, and combinations thereof.
14. A method according to claim 1, wherein the ordered list is arranged so that an identifier associated with a most up-regulated gene is positioned at the top of the ordered list and an identifier associated with a most down-regulated gene is positioned at the bottom of the ordered list.
15. A method according to claim 14, wherein the ordered list of each keratinocyte instance is arranged so that an identifier associated with each gene that is not differentially expressed is positioned between the identifier associated with the most up-regulated gene and the identifier associated with the most down-regulated gene.
16. A method according to claim 1, wherein each keratinocyte instance comprises between about 1,000 and about 50,000 identifiers.
17. A method according to claim 1, wherein each keratinocyte instance comprises metadata for the at least one perturbagen associated with the instance.
18. A method according to claim 1, wherein at least one perturbagen is an anti-fungal agent.
19. A method according to claim 18, wherein an anti-fungal agent comprises zinc pyrithione (ZPT), selenium sulfide or both.
20. A method according to claim 1, wherein at least one perturbagen comprises an environmental stimuli.
21. The method according to claim 10 wherein the host response comprises one or more of inducing lipid metabolism, suppressing inflammation, suppressing cell proliferation, suppressing cell apoptosis and normalizing cell differentiation.
22. The method according to claim 21, wherein the host response comprises inducing lipid metabolism and suppressing inflammation.